Expressions
Standard expressions are basic operations that can be used across the platform, such as finding the maximum value in a column, extracting the year from a date field, or removing the leading zeroes from a text field.
Scalar Functions
Aggregate Functions
AI Functions
- AI Functions: leverage AI and machine learning capabilities
Specialized Functions
System and Table Functions
Other Functions
1 - Aggregate Functions
Aggregate functions are essential tools in SQL that allow you to perform calculations on a set of values and return a single result.
These functions help you extract and summarize data from databases to gain valuable insights.
Function Name | What It Does |
---|
ANY | Checks if any row meets the specified condition |
APPROX_COUNT_DISTINCT | Estimates the number of distinct values with HyperLogLog |
ARG_MAX | Finds the arg value for the maximum val value |
ARG_MIN | Finds the arg value for the minimum val value |
AVG_IF | Calculates the average for rows meeting a condition |
ARRAY_AGG | Converts all the values of a column to an Array |
AVG | Calculates the average value of a specific column |
COUNT_DISTINCT | Counts the number of distinct values in a column |
COUNT_IF | Counts rows meeting a specified condition |
COUNT | Counts the number of rows that meet certain criteria |
COVAR_POP | Returns the population covariance of a set of number pairs |
COVAR_SAMP | Returns the sample covariance of a set of number pairs |
GROUP_ARRAY_MOVING_AVG | Returns an array whose elements are the moving averages of the input values |
GROUP_ARRAY_MOVING_SUM | Returns an array whose elements are the moving sums of the input values |
KURTOSIS | Calculates the excess kurtosis of a set of values |
MAX_IF | Finds the maximum value for rows meeting a condition |
MAX | Finds the largest value in a specific column |
MEDIAN | Calculates the median value of a specific column |
MEDIAN_TDIGEST | Calculates the median value of a specific column using t-digest algorithm |
MIN_IF | Finds the minimum value for rows meeting a condition |
MIN | Finds the smallest value in a specific column |
QUANTILE_CONT | Calculates the interpolated quantile for a specific column |
QUANTILE_DISC | Calculates the quantile for a specific column |
QUANTILE_TDIGEST | Calculates the quantile using t-digest algorithm |
QUANTILE_TDIGEST_WEIGHTED | Calculates the weighted quantile using the t-digest algorithm |
RETENTION | Calculates retention for a set of events |
SKEWNESS | Calculates the skewness of a set of values |
STDDEV_POP | Calculates the population standard deviation of a column |
STDDEV_SAMP | Calculates the sample standard deviation of a column |
STRING_AGG | Converts all the non-NULL values to String, separated by the delimiter |
SUM_IF | Adds up the values meeting a condition of a specific column |
SUM | Adds up the values of a specific column |
WINDOW_FUNNEL | Analyzes user behavior in a time-ordered sequence of events |
1.1 - ANY
Aggregate function.
The ANY() function selects the first encountered non-NULL value, and returns NULL only if every row has NULL in that column. Rows may be processed in any order, and the order can differ from one execution to the next, so the result of this function is nondeterministic. To get a deterministic result, use MIN or MAX instead of ANY.
Analyze Syntax
Analyze Examples
func.any(table.product_name).alias('any_product_name')
| any_product_name |
|------------------|
| Laptop |
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | Any expression |
Return Type
The first encountered (non-NULL) value, in the type of the value. If all values are NULL, the return value is NULL.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE product_data (
id INT,
product_name VARCHAR NULL,
price FLOAT NULL
);
INSERT INTO product_data (id, product_name, price)
VALUES (1, 'Laptop', 1000),
(2, NULL, 800),
(3, 'Keyboard', NULL),
(4, 'Mouse', 25),
(5, 'Monitor', 150);
Query Demo: Retrieve the First Encountered Non-NULL Product Name
SELECT ANY(product_name) AS any_product_name
FROM product_data;
Result
| any_product_name |
|------------------|
| Laptop |
1.2 - APPROX_COUNT_DISTINCT
Estimates the number of distinct values in a data set with the HyperLogLog algorithm.
The HyperLogLog algorithm provides an approximation of the number of unique elements using little memory and time. Consider using this function when dealing with large data sets where an estimated result can be accepted. In exchange for some accuracy, this is a fast and efficient method of returning distinct counts.
To get an accurate result, use COUNT_DISTINCT. See Examples for more explanations.
Analyze Syntax
func.approx_count_distinct(<expr>)
Analyze Examples
func.approx_count_distinct(table.user_id).alias('approx_distinct_user_count')
| approx_distinct_user_count |
|----------------------------|
| 4 |
SQL Syntax
APPROX_COUNT_DISTINCT(<expr>)
Return Type
Integer.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE user_events (
id INT,
user_id INT,
event_name VARCHAR
);
INSERT INTO user_events (id, user_id, event_name)
VALUES (1, 1, 'Login'),
(2, 2, 'Login'),
(3, 3, 'Login'),
(4, 1, 'Logout'),
(5, 2, 'Logout'),
(6, 4, 'Login'),
(7, 1, 'Login');
Query Demo: Estimate the Number of Distinct User IDs
SELECT APPROX_COUNT_DISTINCT(user_id) AS approx_distinct_user_count
FROM user_events;
Result
| approx_distinct_user_count |
|----------------------------|
| 4 |
1.3 - ARG_MAX
Calculates the arg value for a maximum val value. If there are several values of arg for maximum values of val, returns the first of these values encountered.
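The selection rule above can be illustrated in plain Python. This is only a sketch of the semantics, not the engine's implementation; the helper name arg_max and the handling of NULL val values (skipped) are assumptions made for illustration.

```python
def arg_max(rows):
    """Return the arg paired with the largest val.

    rows is an iterable of (arg, val) pairs. On ties, the first pair
    encountered wins because the comparison below is strict.
    """
    best_arg, best_val = None, None
    for arg, val in rows:
        if val is None:
            continue  # assumption: rows with a NULL val are skipped
        if best_val is None or val > best_val:
            best_arg, best_val = arg, val
    return best_arg


sales = [
    ("Product A", 10.5),
    ("Product B", 20.75),
    ("Product C", 30.0),
    ("Product D", 15.25),
    ("Product E", 25.5),
]
print(arg_max(sales))  # Product C
```

ARG_MIN follows the same pattern with the comparison reversed.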
Analyze Syntax
Analyze Examples
func.arg_max(table.product, table.price).alias('max_price_product')
| max_price_product |
| ----------------- |
| Product C |
SQL Syntax
Arguments
Return Type
The arg value that corresponds to the maximum val value. The return type matches the arg type.
SQL Examples
Creating a Table and Inserting Sample Data
Let's create a table named "sales" and insert some sample data:
CREATE TABLE sales (
id INTEGER,
product VARCHAR(50),
price FLOAT
);
INSERT INTO sales (id, product, price)
VALUES (1, 'Product A', 10.5),
(2, 'Product B', 20.75),
(3, 'Product C', 30.0),
(4, 'Product D', 15.25),
(5, 'Product E', 25.5);
Query: Using ARG_MAX() Function
Now, let's use the ARG_MAX() function to find the product that has the maximum price:
SELECT ARG_MAX(product, price) AS max_price_product
FROM sales;
The result should look like this:
| max_price_product |
| ----------------- |
| Product C |
1.4 - ARG_MIN
Calculates the arg value for a minimum val value. If there are several different values of arg for minimum values of val, returns the first of these values encountered.
Analyze Syntax
Analyze Examples
func.arg_min(table.name, table.score).alias('student_name')
| student_name |
|--------------|
| Bob |
SQL Syntax
Arguments
Return Type
The arg value that corresponds to the minimum val value. The return type matches the arg type.
SQL Examples
Let's create a table students with columns id, name, and score, and insert some data:
CREATE TABLE students (
id INT,
name VARCHAR,
score INT
);
INSERT INTO students (id, name, score) VALUES
(1, 'Alice', 80),
(2, 'Bob', 75),
(3, 'Charlie', 90),
(4, 'Dave', 80);
Now, we can use ARG_MIN to find the name of the student with the lowest score:
SELECT ARG_MIN(name, score) AS student_name
FROM students;
Result:
| student_name |
|--------------|
| Bob |
1.5 - ARRAY_AGG
The ARRAY_AGG function (also known by its alias LIST) transforms all the values, including NULL, of a specific column in a query result into an array.
Analyze Syntax
Analyze Examples
table.movie_title, func.array_agg(table.rating).alias('ratings')
| movie_title | ratings |
|-------------|------------|
| Inception | [5, 4, 5] |
SQL Syntax
ARRAY_AGG(<expr>)
LIST(<expr>)
Arguments
Arguments | Description |
---|
<expr> | Any expression |
Return Type
Returns an Array with elements that are of the same type as the original data.
SQL Examples
This example demonstrates how the ARRAY_AGG function can be used to aggregate and present data in a convenient array format:
-- Create a table and insert sample data
CREATE TABLE movie_ratings (
id INT,
movie_title VARCHAR,
user_id INT,
rating INT
);
INSERT INTO movie_ratings (id, movie_title, user_id, rating)
VALUES (1, 'Inception', 1, 5),
(2, 'Inception', 2, 4),
(3, 'Inception', 3, 5),
(4, 'Interstellar', 1, 4),
(5, 'Interstellar', 2, 3);
-- List all ratings for Inception in an array
SELECT movie_title, ARRAY_AGG(rating) AS ratings
FROM movie_ratings
WHERE movie_title = 'Inception'
GROUP BY movie_title;
| movie_title | ratings |
|-------------|------------|
| Inception | [5, 4, 5] |
1.6 - AVG
Aggregate function.
The AVG() function returns the average value of an expression.
Note: NULL values are not counted.
Analyze Syntax
Analyze Examples
func.avg(table.price).alias('avg_price')
| avg_price |
| --------- |
| 20.4 |
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | Any numerical expression |
Return Type
double
SQL Examples
Creating a Table and Inserting Sample Data
Let's create a table named "sales" and insert some sample data:
CREATE TABLE sales (
id INTEGER,
product VARCHAR(50),
price FLOAT
);
INSERT INTO sales (id, product, price)
VALUES (1, 'Product A', 10.5),
(2, 'Product B', 20.75),
(3, 'Product C', 30.0),
(4, 'Product D', 15.25),
(5, 'Product E', 25.5);
Query: Using AVG() Function
Now, let's use the AVG() function to find the average price of all products in the "sales" table:
SELECT AVG(price) AS avg_price
FROM sales;
The result should look like this:
| avg_price |
| --------- |
| 20.4 |
1.7 - AVG_IF
The suffix _IF can be appended to the name of any aggregate function. In this case, the aggregate function accepts an extra argument: a condition. Only rows for which the condition is true are aggregated.
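As a rough illustration of the conditional suffix described above, the plain-Python sketch below filters values by a per-row condition before averaging. The helper name avg_if and the choice to return None when no row matches are assumptions, not documented behavior.

```python
def avg_if(values, conditions):
    """Average only the values whose matching condition is true."""
    kept = [v for v, keep in zip(values, conditions) if keep and v is not None]
    # Assumption: with no matching rows there is nothing to average.
    return sum(kept) / len(kept) if kept else None


salaries = [50000, 60000, 55000, 70000, 65000]
departments = ["HR", "IT", "HR", "IT", "IT"]

avg_salary_it = avg_if(salaries, [d == "IT" for d in departments])
print(avg_salary_it)  # 65000.0
```

The same filter-then-aggregate pattern applies to COUNT_IF, MAX_IF, MIN_IF, and SUM_IF.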
Analyze Syntax
func.avg_if(<column>, <cond>)
Analyze Examples
func.avg_if(table.salary, table.department=='IT').alias('avg_salary_it')
| avg_salary_it |
|-----------------|
| 65000.0 |
SQL Syntax
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE employees (
id INT,
salary INT,
department VARCHAR
);
INSERT INTO employees (id, salary, department)
VALUES (1, 50000, 'HR'),
(2, 60000, 'IT'),
(3, 55000, 'HR'),
(4, 70000, 'IT'),
(5, 65000, 'IT');
Query Demo: Calculate Average Salary for IT Department
SELECT AVG_IF(salary, department = 'IT') AS avg_salary_it
FROM employees;
Result
| avg_salary_it |
|-----------------|
| 65000.0 |
1.8 - COUNT
The COUNT() function returns the number of records returned by a SELECT query.
Caution: NULL values are not counted.
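The difference between counting a column and counting rows can be sketched in plain Python, with None standing in for NULL. This is an illustration of the semantics only; the variable names are hypothetical.

```python
# One entry per row; None stands in for a NULL grade.
grades = [85, None, 90, 88, 92]

# COUNT(*) counts every row, regardless of NULLs.
count_star = len(grades)

# COUNT(grade) counts only the rows where grade is not NULL.
count_grade = sum(1 for g in grades if g is not None)

print(count_star, count_grade)  # 5 4
```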
Analyze Syntax
Analyze Examples
func.count(table.grade).alias('count_valid_grades')
| count_valid_grades |
|--------------------|
| 4 |
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | Any expression. This may be a column name, the result of another function, or a math operation. * is also allowed, to indicate pure row counting. |
Return Type
An integer.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE students (
id INT,
name VARCHAR,
age INT,
grade FLOAT NULL
);
INSERT INTO students (id, name, age, grade)
VALUES (1, 'John', 21, 85),
(2, 'Emma', 22, NULL),
(3, 'Alice', 23, 90),
(4, 'Michael', 21, 88),
(5, 'Sophie', 22, 92);
Query Demo: Count Students with Valid Grades
SELECT COUNT(grade) AS count_valid_grades
FROM students;
Result
| count_valid_grades |
|--------------------|
| 4 |
1.9 - COUNT_DISTINCT
Aggregate function.
The COUNT(DISTINCT ...) function counts the number of unique values in a set of values.
To obtain an estimated result from large data sets with little memory and time, consider using APPROX_COUNT_DISTINCT.
Caution: NULL values are not counted.
Analyze Syntax
func.count_distinct(<column>)
Analyze Examples
func.count_distinct(table.category).alias('unique_categories')
| unique_categories |
|-------------------|
| 2 |
SQL Syntax
COUNT(distinct <expr> ...)
UNIQ(<expr>)
Arguments
Arguments | Description |
---|
<expr> | Any expression. Between 1 and 32 arguments are accepted |
Return Type
UInt64
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE products (
id INT,
name VARCHAR,
category VARCHAR,
price FLOAT
);
INSERT INTO products (id, name, category, price)
VALUES (1, 'Laptop', 'Electronics', 1000),
(2, 'Smartphone', 'Electronics', 800),
(3, 'Tablet', 'Electronics', 600),
(4, 'Chair', 'Furniture', 150),
(5, 'Table', 'Furniture', 300);
Query Demo: Count Distinct Categories
SELECT COUNT(DISTINCT category) AS unique_categories
FROM products;
Result
| unique_categories |
|-------------------|
| 2 |
1.10 - COUNT_IF
The suffix _IF can be appended to the name of any aggregate function. In this case, the aggregate function accepts an extra argument: a condition.
Analyze Syntax
func.count_if(<column>, <cond>)
Analyze Examples
func.count_if(table.status, table.status=='completed').alias('completed_orders')
| completed_orders |
|------------------|
| 3 |
SQL Syntax
COUNT_IF(<column>, <cond>)
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE orders (
id INT,
customer_id INT,
status VARCHAR,
total FLOAT
);
INSERT INTO orders (id, customer_id, status, total)
VALUES (1, 1, 'completed', 100),
(2, 2, 'completed', 200),
(3, 1, 'pending', 150),
(4, 3, 'completed', 250),
(5, 2, 'pending', 300);
Query Demo: Count Completed Orders
SELECT COUNT_IF(status, status = 'completed') AS completed_orders
FROM orders;
Result
| completed_orders |
|------------------|
| 3 |
1.11 - COVAR_POP
COVAR_POP returns the population covariance of a set of number pairs.
Analyze Syntax
func.covar_pop(<expr1>, <expr2>)
Analyze Examples
func.covar_pop(table.units_sold, table.revenue).alias('covar_pop_units_revenue')
| covar_pop_units_revenue |
|-------------------------|
| 20000.0 |
SQL Syntax
COVAR_POP(<expr1>, <expr2>)
Arguments
Arguments | Description |
---|
<expr1> | Any numerical expression |
<expr2> | Any numerical expression |
Return Type
float64
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE product_sales (
id INT,
product_id INT,
units_sold INT,
revenue FLOAT
);
INSERT INTO product_sales (id, product_id, units_sold, revenue)
VALUES (1, 1, 10, 1000),
(2, 2, 20, 2000),
(3, 3, 30, 3000),
(4, 4, 40, 4000),
(5, 5, 50, 5000);
Query Demo: Calculate Population Covariance between Units Sold and Revenue
SELECT COVAR_POP(units_sold, revenue) AS covar_pop_units_revenue
FROM product_sales;
Result
| covar_pop_units_revenue |
|-------------------------|
| 20000.0 |
1.12 - COVAR_SAMP
Aggregate function.
The covar_samp() function returns the sample covariance (Σ((x - x̅)(y - y̅)) / (n - 1)) of two data columns.
Caution: NULL values are not counted.
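The formula above can be checked by hand with a short plain-Python sketch. The helper name covar_samp is hypothetical; skipping pairs that contain a NULL is an assumption based on the caution above, and the n <= 1 branch follows the Return Type note for this function.

```python
import math


def covar_samp(xs, ys):
    """Sample covariance: sum((x - mean_x) * (y - mean_y)) / (n - 1)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n <= 1:
        return math.inf  # mirrors the Return Type note for n <= 1
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (n - 1)


items_sold = [100, 200, 300, 400, 500]
profit = [1000, 2000, 3000, 4000, 5000]
print(covar_samp(items_sold, profit))  # 250000.0
```

Dividing by n instead of n - 1 gives the population covariance computed by COVAR_POP.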
Analyze Syntax
func.covar_samp(<expr1>, <expr2>)
Analyze Examples
func.covar_samp(table.items_sold, table.profit).alias('covar_samp_items_profit')
| covar_samp_items_profit |
|-------------------------|
| 250000.0 |
SQL Syntax
COVAR_SAMP(<expr1>, <expr2>)
Arguments
Arguments | Description |
---|
<expr1> | Any numerical expression |
<expr2> | Any numerical expression |
Return Type
float64. When n <= 1, returns +∞.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE store_sales (
id INT,
store_id INT,
items_sold INT,
profit FLOAT
);
INSERT INTO store_sales (id, store_id, items_sold, profit)
VALUES (1, 1, 100, 1000),
(2, 2, 200, 2000),
(3, 3, 300, 3000),
(4, 4, 400, 4000),
(5, 5, 500, 5000);
Query Demo: Calculate Sample Covariance between Items Sold and Profit
SELECT COVAR_SAMP(items_sold, profit) AS covar_samp_items_profit
FROM store_sales;
Result
| covar_samp_items_profit |
|-------------------------|
| 250000.0 |
1.13 - GROUP_ARRAY_MOVING_AVG
The GROUP_ARRAY_MOVING_AVG function calculates the moving average of input values. The function can take the window size as a parameter. If left unspecified, the function takes the window size equal to the number of input values.
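A plain-Python sketch of the windowing may help. It assumes that each output element is the sum of the most recent window_size inputs divided by window_size, even while the window is still filling; that assumption is inferred from the example output in this section rather than from a specification, and the helper name is hypothetical.

```python
def group_array_moving_avg(values, window_size=None):
    """Moving average over the trailing window_size inputs."""
    if window_size is None:
        window_size = len(values)  # default: window spans all inputs
    result = []
    running = 0
    for i, v in enumerate(values):
        running += v
        if i >= window_size:
            running -= values[i - window_size]  # drop the value leaving the window
        # Assumption: always divide by the full window size.
        result.append(running / window_size)
    return result


print(group_array_moving_avg([10, 13, 30], 2))  # [5.0, 11.5, 21.5]
```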
Analyze Syntax
func.group_array_moving_avg(<expr>)
Analyze Examples
table.user_id, func.group_array_moving_avg(table.request_num).alias('avg_request_num')
| user_id | avg_request_num |
|---------|------------------|
| 1 | [5.0,11.5,21.5] |
| 3 | [10.0,22.5,35.0] |
| 2 | [7.5,18.0,31.0] |
SQL Syntax
GROUP_ARRAY_MOVING_AVG(<expr>)
GROUP_ARRAY_MOVING_AVG(<window_size>)(<expr>)
Arguments
Arguments | Description |
---|
<window_size> | Any numerical expression |
<expr> | Any numerical expression |
Return Type
Returns an Array with elements of double or decimal depending on the source data type.
SQL Examples
-- Create a table and insert sample data
CREATE TABLE hits (
user_id INT,
request_num INT
);
INSERT INTO hits (user_id, request_num)
VALUES (1, 10),
(2, 15),
(3, 20),
(1, 13),
(2, 21),
(3, 25),
(1, 30),
(2, 41),
(3, 45);
SELECT user_id, GROUP_ARRAY_MOVING_AVG(2)(request_num) AS avg_request_num
FROM hits
GROUP BY user_id;
| user_id | avg_request_num |
|---------|------------------|
| 1 | [5.0,11.5,21.5] |
| 3 | [10.0,22.5,35.0] |
| 2 | [7.5,18.0,31.0] |
1.14 - GROUP_ARRAY_MOVING_SUM
The GROUP_ARRAY_MOVING_SUM function calculates the moving sum of input values. The function can take the window size as a parameter. If left unspecified, the function takes the window size equal to the number of input values.
Analyze Syntax
func.group_array_moving_sum(<expr>)
Analyze Examples
table.user_id, func.group_array_moving_sum(table.request_num)
| user_id | request_num |
|---------|-------------|
| 1 | [10,23,43] |
| 2 | [15,36,62] |
| 3 | [20,45,70] |
SQL Syntax
GROUP_ARRAY_MOVING_SUM(<expr>)
GROUP_ARRAY_MOVING_SUM(<window_size>)(<expr>)
Arguments
Arguments | Description |
---|
<window_size> | Any numerical expression |
<expr> | Any numerical expression |
Return Type
Returns an Array with elements that are of the same type as the original data.
SQL Examples
-- Create a table and insert sample data
CREATE TABLE hits (
user_id INT,
request_num INT
);
INSERT INTO hits (user_id, request_num)
VALUES (1, 10),
(2, 15),
(3, 20),
(1, 13),
(2, 21),
(3, 25),
(1, 30),
(2, 41),
(3, 45);
SELECT user_id, GROUP_ARRAY_MOVING_SUM(2)(request_num) AS request_num
FROM hits
GROUP BY user_id;
| user_id | request_num |
|---------|-------------|
| 1 | [10,23,43] |
| 2 | [15,36,62] |
| 3 | [20,45,70] |
1.15 - KURTOSIS
Aggregate function.
The KURTOSIS() function returns the excess kurtosis of all input values.
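For intuition, the sketch below computes a bias-corrected sample excess kurtosis in plain Python. Which estimator the engine actually uses is an assumption here; this particular formula was chosen because it reproduces the value shown in the example for this function, up to floating-point precision. The helper name is hypothetical.

```python
def excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (assumed estimator)."""
    xs = [v for v in values if v is not None]
    n = len(xs)
    if n < 4:
        return None  # the correction terms are undefined for n < 4
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs)  # sum of squared deviations
    m4 = sum((x - mean) ** 4 for x in xs)  # sum of fourth-power deviations
    s2 = m2 / (n - 1)                      # sample variance
    term = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * m4 / s2 ** 2
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return term - correction


prices = [150, 152, 148, 160, 155]
print(round(excess_kurtosis(prices), 6))  # 0.068182
```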
Analyze Syntax
Analyze Examples
func.kurtosis(table.price).alias('excess_kurtosis')
| excess_kurtosis |
|-------------------------|
| 0.06818181325581445 |
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | Any numerical expression |
Return Type
Nullable Float64.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE stock_prices (
id INT,
stock_symbol VARCHAR,
price FLOAT
);
INSERT INTO stock_prices (id, stock_symbol, price)
VALUES (1, 'AAPL', 150),
(2, 'AAPL', 152),
(3, 'AAPL', 148),
(4, 'AAPL', 160),
(5, 'AAPL', 155);
Query Demo: Calculate Excess Kurtosis for Apple Stock Prices
SELECT KURTOSIS(price) AS excess_kurtosis
FROM stock_prices
WHERE stock_symbol = 'AAPL';
Result
| excess_kurtosis |
|-------------------------|
| 0.06818181325581445 |
1.16 - MAX
Aggregate function.
The MAX() function returns the maximum value in a set of values.
Analyze Syntax
Analyze Examples
table.city, func.max(table.temperature).alias('max_temperature')
| city | max_temperature |
|------------|-----------------|
| New York | 32 |
SQL Syntax
MAX(<expr>)
Arguments
Arguments | Description |
---|
<expr> | Any expression |
Return Type
The maximum value, in the type of the value.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE temperatures (
id INT,
city VARCHAR,
temperature FLOAT
);
INSERT INTO temperatures (id, city, temperature)
VALUES (1, 'New York', 30),
(2, 'New York', 28),
(3, 'New York', 32),
(4, 'Los Angeles', 25),
(5, 'Los Angeles', 27);
Query Demo: Find Maximum Temperature for New York City
SELECT city, MAX(temperature) AS max_temperature
FROM temperatures
WHERE city = 'New York'
GROUP BY city;
Result
| city | max_temperature |
|------------|-----------------|
| New York | 32 |
1.17 - MAX_IF
The suffix _IF can be appended to the name of any aggregate function. In this case, the aggregate function accepts an extra argument: a condition.
Analyze Syntax
func.max_if(<column>, <cond>)
Analyze Examples
func.max_if(table.revenue, table.salesperson_id==1).alias('max_revenue_salesperson_1')
| max_revenue_salesperson_1 |
|---------------------------|
| 3000 |
SQL Syntax
MAX_IF(<column>, <cond>)
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE sales (
id INT,
salesperson_id INT,
product_id INT,
revenue FLOAT
);
INSERT INTO sales (id, salesperson_id, product_id, revenue)
VALUES (1, 1, 1, 1000),
(2, 1, 2, 2000),
(3, 1, 3, 3000),
(4, 2, 1, 1500),
(5, 2, 2, 2500);
Query Demo: Find Maximum Revenue for Salesperson with ID 1
SELECT MAX_IF(revenue, salesperson_id = 1) AS max_revenue_salesperson_1
FROM sales;
Result
| max_revenue_salesperson_1 |
|---------------------------|
| 3000 |
1.18 - MEDIAN
Aggregate function.
The MEDIAN() function computes the median of a numeric data sequence.
Caution: NULL values are not counted.
Analyze Syntax
Analyze Examples
func.median(table.score).alias('median_score')
| median_score |
|----------------|
| 85.0 |
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | Any numerical expression |
Return Type
The type of the value.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE exam_scores (
id INT,
student_id INT,
score INT
);
INSERT INTO exam_scores (id, student_id, score)
VALUES (1, 1, 80),
(2, 2, 90),
(3, 3, 75),
(4, 4, 95),
(5, 5, 85);
Query Demo: Calculate Median Exam Score
SELECT MEDIAN(score) AS median_score
FROM exam_scores;
Result
| median_score |
|----------------|
| 85.0 |
1.19 - MEDIAN_TDIGEST
Computes the median of a numeric data sequence using the t-digest algorithm.
Caution: NULL values are not included in the calculation.
Analyze Syntax
func.median_tdigest(<expr>)
Analyze Examples
func.median_tdigest(table.score).alias('median_score')
| median_score |
|----------------|
| 85.0 |
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | Any numerical expression |
Return Type
Returns a value of the same data type as the input values.
SQL Examples
-- Create a table and insert sample data
CREATE TABLE exam_scores (
id INT,
student_id INT,
score INT
);
INSERT INTO exam_scores (id, student_id, score)
VALUES (1, 1, 80),
(2, 2, 90),
(3, 3, 75),
(4, 4, 95),
(5, 5, 85);
-- Calculate median exam score
SELECT MEDIAN_TDIGEST(score) AS median_score
FROM exam_scores;
| median_score |
|----------------|
| 85.0 |
1.20 - MIN
Aggregate function.
The MIN() function returns the minimum value in a set of values.
Analyze Syntax
Analyze Examples
table.station_id, func.min(table.price).alias('min_price')
| station_id | min_price |
|------------|-----------|
| 1 | 3.45 |
SQL Syntax
MIN(<expr>)
Arguments
Arguments | Description |
---|
<expr> | Any expression |
Return Type
The minimum value, in the type of the value.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE gas_prices (
id INT,
station_id INT,
price FLOAT
);
INSERT INTO gas_prices (id, station_id, price)
VALUES (1, 1, 3.50),
(2, 1, 3.45),
(3, 1, 3.55),
(4, 2, 3.40),
(5, 2, 3.35);
Query Demo: Find Minimum Gas Price for Station 1
SELECT station_id, MIN(price) AS min_price
FROM gas_prices
WHERE station_id = 1
GROUP BY station_id;
Result
| station_id | min_price |
|------------|-----------|
| 1 | 3.45 |
1.21 - MIN_IF
The suffix _IF can be appended to the name of any aggregate function. In this case, the aggregate function accepts an extra argument: a condition.
Analyze Syntax
func.min_if(<column>, <cond>)
Analyze Examples
func.min_if(table.budget, table.department=='IT').alias('min_it_budget')
| min_it_budget |
|---------------|
| 2000 |
SQL Syntax
MIN_IF(<column>, <cond>)
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE project_budgets (
id INT,
project_id INT,
department VARCHAR,
budget FLOAT
);
INSERT INTO project_budgets (id, project_id, department, budget)
VALUES (1, 1, 'HR', 1000),
(2, 1, 'IT', 2000),
(3, 1, 'Marketing', 3000),
(4, 2, 'HR', 1500),
(5, 2, 'IT', 2500);
Query Demo: Find Minimum Budget for IT Department
SELECT MIN_IF(budget, department = 'IT') AS min_it_budget
FROM project_budgets;
Result
| min_it_budget |
|---------------|
| 2000 |
1.22 - QUANTILE_CONT
Aggregate function.
The QUANTILE_CONT() function computes an interpolated quantile of a numeric data sequence.
Caution: NULL values are not counted.
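One common way to interpolate a quantile is linear interpolation between the two nearest sorted values. The plain-Python sketch below uses that method; whether the engine uses exactly this interpolation is an assumption, and the helper name is hypothetical.

```python
def quantile_cont(values, level):
    """Quantile by linear interpolation between neighboring sorted values."""
    xs = sorted(v for v in values if v is not None)  # NULLs are not counted
    if not xs:
        return None
    pos = level * (len(xs) - 1)          # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    frac = pos - lower
    return xs[lower] + (xs[upper] - xs[lower]) * frac


sales_amounts = [5000, 5500, 6000, 6500, 7000]
print(quantile_cont(sales_amounts, 0.5))  # 6000.0
```

A level that falls between two rows, such as 0.1 here, yields a value that does not appear in the data; this is the behavior that distinguishes an interpolated quantile from a discrete one.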
Analyze Syntax
func.quantile_cont(<levels>, <expr>)
Analyze Examples
func.quantile_cont(0.5, table.sales_amount).alias('median_sales_amount')
| median_sales_amount |
|-----------------------|
| 6000.0 |
SQL Syntax
QUANTILE_CONT(<levels>)(<expr>)
QUANTILE_CONT(level1, level2, ...)(<expr>)
Arguments
Arguments | Description |
---|
<level(s)> | Quantile level(s). Each level is a constant floating-point number from 0 to 1. We recommend using a level value in the range of [0.01, 0.99] |
<expr> | Any numerical expression |
Return Type
Float64, or an array of Float64 when more than one level is specified.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE sales_data (
id INT,
sales_person_id INT,
sales_amount FLOAT
);
INSERT INTO sales_data (id, sales_person_id, sales_amount)
VALUES (1, 1, 5000),
(2, 2, 5500),
(3, 3, 6000),
(4, 4, 6500),
(5, 5, 7000);
Query Demo: Calculate 50th Percentile (Median) of Sales Amount using Interpolation
SELECT QUANTILE_CONT(0.5)(sales_amount) AS median_sales_amount
FROM sales_data;
Result
| median_sales_amount |
|-----------------------|
| 6000.0 |
1.23 - QUANTILE_DISC
Aggregate function.
The QUANTILE_DISC() function computes the exact quantile of a numeric data sequence.
QUANTILE is an alias for QUANTILE_DISC.
Caution: NULL values are not counted.
Analyze Syntax
func.quantile_disc(<levels>, <expr>)
Analyze Examples
func.quantile_disc([0.25, 0.75], table.salary).alias('salary_quantiles')
| salary_quantiles |
|---------------------|
| [55000.0, 65000.0] |
SQL Syntax
QUANTILE_DISC(<levels>)(<expr>)
QUANTILE_DISC(level1, level2, ...)(<expr>)
Arguments
Arguments | Description |
---|
<level(s)> | Quantile level(s). Each level is a constant floating-point number from 0 to 1. We recommend using a level value in the range of [0.01, 0.99] |
<expr> | Any numerical expression |
Return Type
The input type, or an array of the input type when more than one level is specified.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE salary_data (
id INT,
employee_id INT,
salary FLOAT
);
INSERT INTO salary_data (id, employee_id, salary)
VALUES (1, 1, 50000),
(2, 2, 55000),
(3, 3, 60000),
(4, 4, 65000),
(5, 5, 70000);
Query Demo: Calculate 25th and 75th Percentile of Salaries
SELECT QUANTILE_DISC(0.25, 0.75)(salary) AS salary_quantiles
FROM salary_data;
Result
| salary_quantiles |
|---------------------|
| [55000.0, 65000.0] |
1.24 - QUANTILE_TDIGEST
Computes an approximate quantile of a numeric data sequence using the t-digest algorithm.
Caution: NULL values are not included in the calculation.
Analyze Syntax
func.quantile_tdigest(<levels>, <expr>)
Analyze Examples
func.quantile_tdigest([0.5, 0.8], table.sales_amount).alias('sales_amounts')
| sales_amounts |
|-----------------------|
| [6000.0,7000.0] |
SQL Syntax
QUANTILE_TDIGEST(<level1>[, <level2>, ...])(<expr>)
Arguments
Arguments | Description |
---|
<level n> | A level of quantile represents a constant floating-point number ranging from 0 to 1. It is recommended to use a level value in the range of [0.01, 0.99]. |
<expr> | Any numerical expression |
Return Type
Returns either a Float64 value or an array of Float64 values, depending on the number of quantile levels specified.
SQL Examples
-- Create a table and insert sample data
CREATE TABLE sales_data (
id INT,
sales_person_id INT,
sales_amount FLOAT
);
INSERT INTO sales_data (id, sales_person_id, sales_amount)
VALUES (1, 1, 5000),
(2, 2, 5500),
(3, 3, 6000),
(4, 4, 6500),
(5, 5, 7000);
SELECT QUANTILE_TDIGEST(0.5)(sales_amount) AS median_sales_amount
FROM sales_data;
median_sales_amount|
-------------------+
6000.0|
SELECT QUANTILE_TDIGEST(0.5, 0.8)(sales_amount)
FROM sales_data;
quantile_tdigest(0.5, 0.8)(sales_amount)|
----------------------------------------+
[6000.0,7000.0] |
1.25 - QUANTILE_TDIGEST_WEIGHTED
Computes an approximate quantile of a numeric data sequence using the t-digest algorithm.
This function takes into account the weight of each sequence member. Memory consumption is O(log n), where n is the number of values.
Caution: NULL values are not included in the calculation.
Analyze Syntax
func.quantile_tdigest_weighted(<levels>, <expr>, <weight_expr>)
Analyze Examples
func.quantile_tdigest_weighted([0.5, 0.8], table.sales_amount, 1).alias('sales_amounts')
| sales_amounts |
|-----------------------|
| [6000.0,7000.0] |
SQL Syntax
QUANTILE_TDIGEST_WEIGHTED(<level1>[, <level2>, ...])(<expr>, <weight_expr>)
Arguments
Arguments | Description |
---|
<level n> | A level of quantile represents a constant floating-point number ranging from 0 to 1. It is recommended to use a level value in the range of [0.01, 0.99]. |
<expr> | Any numerical expression |
<weight_expr> | Any unsigned integer expression. The weight is the number of occurrences of the value. |
Return Type
Returns either a Float64 value or an array of Float64 values, depending on the number of quantile levels specified.
SQL Examples
-- Create a table and insert sample data
CREATE TABLE sales_data (
id INT,
sales_person_id INT,
sales_amount FLOAT
);
INSERT INTO sales_data (id, sales_person_id, sales_amount)
VALUES (1, 1, 5000),
(2, 2, 5500),
(3, 3, 6000),
(4, 4, 6500),
(5, 5, 7000);
SELECT QUANTILE_TDIGEST_WEIGHTED(0.5)(sales_amount, 1) AS median_sales_amount
FROM sales_data;
median_sales_amount|
-------------------+
6000.0|
SELECT QUANTILE_TDIGEST_WEIGHTED(0.5, 0.8)(sales_amount, 1)
FROM sales_data;
quantile_tdigest_weighted(0.5, 0.8)(sales_amount)|
-------------------------------------------------+
[6000.0,7000.0] |
1.26 - RETENTION
Aggregate function.
The RETENTION() function takes between 1 and 32 conditions as arguments, each of type UInt8, indicating whether a certain condition was met for the event.
Any condition can be specified as an argument (as in WHERE).
The conditions, except the first, apply in pairs: the result of the second will be true if the first and second are true, of the third if the first and third are true, etc.
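The pairing rule above can be sketched in plain Python. The sketch assumes that a condition counts as met for a group when it is true for at least one event in that group; that reading is consistent with the example for this function but is not stated explicitly in the text. The helper name is hypothetical.

```python
def retention(rows):
    """rows holds one tuple of booleans per event, one entry per condition."""
    if not rows:
        return []
    n_conds = len(rows[0])
    # Assumption: a condition is met if any event in the group satisfies it.
    met = [any(row[i] for row in rows) for i in range(n_conds)]
    # The first condition stands alone; every later one is paired with the first.
    return [int(met[0])] + [int(met[0] and met[i]) for i in range(1, n_conds)]


# Conditions per event: (is signup, is login, is purchase).
user_1 = [(True, False, False), (False, True, False)]   # signup, then login
user_2 = [(True, False, False), (False, False, True)]   # signup, then purchase

print(retention(user_1))  # [1, 1, 0]
print(retention(user_2))  # [1, 0, 1]
```

Under this reading, a user who logged in but never signed up gets all zeros, because every later condition requires the first one to hold.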
Analyze Syntax
func.retention(<cond1> , <cond2> , ..., <cond32>)
Analyze Examples
table.user_id, func.retention(table.event_type=='signup', table.event_type=='login', table.event_type=='purchase').alias('retention')
| user_id | retention |
|---------|-----------|
| 1 | [1, 1, 0] |
| 2 | [1, 0, 1] |
| 3 | [1, 1, 0] |
SQL Syntax
RETENTION(<cond1>, <cond2>, ..., <cond32>)
Arguments
Arguments | Description |
---|
<cond> | An expression that returns a Boolean result |
Return Type
An array of 1s and 0s, with one element per condition.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE user_events (
id INT,
user_id INT,
event_date DATE,
event_type VARCHAR
);
INSERT INTO user_events (id, user_id, event_date, event_type)
VALUES (1, 1, '2022-01-01', 'signup'),
(2, 1, '2022-01-02', 'login'),
(3, 2, '2022-01-01', 'signup'),
(4, 2, '2022-01-03', 'purchase'),
(5, 3, '2022-01-01', 'signup'),
(6, 3, '2022-01-02', 'login');
Query Demo: Calculate User Retention Based on Signup, Login, and Purchase Events
SELECT
user_id,
RETENTION(event_type = 'signup', event_type = 'login', event_type = 'purchase') AS retention
FROM user_events
GROUP BY user_id;
Result
| user_id | retention |
|---------|-----------|
| 1 | [1, 1, 0] |
| 2 | [1, 0, 1] |
| 3 | [1, 1, 0] |
1.27 - SKEWNESS
Aggregate function.
The SKEWNESS() function returns the skewness of all input values.
Analyze Syntax
Analyze Examples
func.skewness(table.temperature).alias('temperature_skewness')
| temperature_skewness |
|----------------------|
| 0.68 |
SQL Syntax
SKEWNESS(<expr>)
Arguments
Arguments | Description |
---|
<expr> | Any numerical expression |
Return Type
Nullable Float64.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE temperature_data (
id INT,
city_id INT,
temperature FLOAT
);
INSERT INTO temperature_data (id, city_id, temperature)
VALUES (1, 1, 60),
(2, 1, 65),
(3, 1, 62),
(4, 2, 70),
(5, 2, 75);
Query Demo: Calculate Skewness of Temperature Data
SELECT SKEWNESS(temperature) AS temperature_skewness
FROM temperature_data;
Result
| temperature_skewness |
|----------------------|
| 0.68 |
1.28 - STDDEV_POP
Aggregate function.
The STDDEV_POP() function returns the population standard deviation (the square root of VAR_POP()) of an expression.
Note: STD() or STDDEV() can also be used, which are equivalent but not standard SQL.
Caution: NULL values are not counted.
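As a rough illustration of the definition above, the following Python sketch computes the population standard deviation as the square root of the population variance and skips NULLs (modelled here as `None`). The `stddev_pop` helper is hypothetical, and returning `None` for an empty input is an assumption of this sketch, not documented behavior.

```python
import math

def stddev_pop(values):
    """Population standard deviation; None stands in for SQL NULL and is skipped."""
    xs = [v for v in values if v is not None]
    if not xs:
        return None  # assumption: no non-NULL input yields no result
    mean = sum(xs) / len(xs)
    # Population variance divides by n (not n - 1).
    var_pop = sum((x - mean) ** 2 for x in xs) / len(xs)
    return math.sqrt(var_pop)

scores = [80, 85, None, 90, 95, 100]
print(round(stddev_pop(scores), 5))  # 7.07107
```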
Analyze Syntax
func.stddev_pop(<expr>)
Analyze Examples
func.stddev_pop(table.score).alias('test_score_stddev_pop')
| test_score_stddev_pop |
|-----------------------|
| 7.07107 |
SQL Syntax
STDDEV_POP(<expr>)
STDDEV(<expr>)
STD(<expr>)
Arguments
Arguments | Description |
---|
<expr> | Any numerical expression |
Return Type
double
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE test_scores (
id INT,
student_id INT,
score FLOAT
);
INSERT INTO test_scores (id, student_id, score)
VALUES (1, 1, 80),
(2, 2, 85),
(3, 3, 90),
(4, 4, 95),
(5, 5, 100);
Query Demo: Calculate Population Standard Deviation of Test Scores
SELECT STDDEV_POP(score) AS test_score_stddev_pop
FROM test_scores;
Result
| test_score_stddev_pop |
|-----------------------|
| 7.07107 |
1.29 - STDDEV_SAMP
Aggregate function.
The STDDEV_SAMP() function returns the sample standard deviation (the square root of VAR_SAMP()) of an expression.
Caution: NULL values are not counted.
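The only difference from the population form is the divisor: the sample variance divides by n - 1. A minimal Python sketch, where the `stddev_samp` helper is hypothetical and returning `None` for fewer than two non-NULL values is an assumption of this sketch:

```python
import math

def stddev_samp(values):
    """Sample standard deviation; None stands in for SQL NULL and is skipped."""
    xs = [v for v in values if v is not None]
    n = len(xs)
    if n < 2:
        return None  # assumption: undefined for fewer than two values
    mean = sum(xs) / n
    # Sample variance divides by n - 1 (Bessel's correction).
    var_samp = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return math.sqrt(var_samp)

heights = [5.8, 6.1, 5.9, 5.7, 6.3]
print(stddev_samp(heights))  # approximately 0.2408
```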
Analyze Syntax
func.stddev_samp(<expr>)
Analyze Examples
func.stddev_samp(table.height).alias('height_stddev_samp')
| height_stddev_samp |
|--------------------|
| 0.240 |
SQL Syntax
STDDEV_SAMP(<expr>)
Arguments
Arguments | Description |
---|
<expr> | Any numerical expression |
Return Type
double
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE height_data (
id INT,
person_id INT,
height FLOAT
);
INSERT INTO height_data (id, person_id, height)
VALUES (1, 1, 5.8),
(2, 2, 6.1),
(3, 3, 5.9),
(4, 4, 5.7),
(5, 5, 6.3);
Query Demo: Calculate Sample Standard Deviation of Heights
SELECT STDDEV_SAMP(height) AS height_stddev_samp
FROM height_data;
Result
| height_stddev_samp |
|--------------------|
| 0.240 |
1.30 - STRING_AGG
Aggregate function.
The STRING_AGG() function converts all the non-NULL values of a column to String, separated by the delimiter.
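In Python terms the behavior resembles a join over the non-NULL values. The `string_agg` helper below is a hypothetical sketch of that idea; note that `str()` converts values implicitly here, whereas the SQL function expects string input.

```python
def string_agg(values, delimiter=""):
    """Join non-NULL values (None here) into one string, separated by delimiter."""
    return delimiter.join(str(v) for v in values if v is not None)

print(string_agg(["Python", None, "Java"], ", "))  # Python, Java
print(string_agg(range(5), "|"))                   # 0|1|2|3|4
print(string_agg(["a", "b"]))                      # ab (default delimiter is empty)
```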
Analyze Syntax
func.string_agg(<expr> [, delimiter])
Analyze Examples
func.string_agg(table.language_name, ', ').alias('concatenated_languages')
| concatenated_languages |
|-----------------------------------------|
| Python, JavaScript, Java, C#, Ruby |
SQL Syntax
STRING_AGG(<expr>)
STRING_AGG(<expr> [, delimiter])
Note: If <expr> is not a String expression, use ::VARCHAR to convert it.
For example:
SELECT string_agg(number::VARCHAR, '|') AS s FROM numbers(5);
+-----------+
| s |
+-----------+
| 0|1|2|3|4 |
+-----------+
Arguments
Arguments | Description |
---|
<expr> | Any string expression (if not a string, use ::VARCHAR to convert) |
delimiter | Optional constant String; if not specified, an empty String is used |
Return Type
String
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE programming_languages (
id INT,
language_name VARCHAR
);
INSERT INTO programming_languages (id, language_name)
VALUES (1, 'Python'),
(2, 'JavaScript'),
(3, 'Java'),
(4, 'C#'),
(5, 'Ruby');
Query Demo: Concatenate Programming Language Names with a Delimiter
SELECT STRING_AGG(language_name, ', ') AS concatenated_languages
FROM programming_languages;
Result
| concatenated_languages |
|------------------------------------------|
| Python, JavaScript, Java, C#, Ruby |
1.31 - SUM
Aggregate function.
The SUM() function calculates the sum of a set of values.
Caution: NULL values are not counted.
Analyze Syntax
func.sum(<expr>)
Analyze Examples
func.sum(table.quantity).alias('total_quantity_sold')
| total_quantity_sold |
|---------------------|
| 41 |
SQL Syntax
SUM(<expr>)
Arguments
Arguments | Description |
---|
<expr> | Any numerical expression |
Return Type
A double if the input type is double, otherwise integer.
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE sales_data (
id INT,
product_id INT,
quantity INT
);
INSERT INTO sales_data (id, product_id, quantity)
VALUES (1, 1, 10),
(2, 2, 5),
(3, 3, 8),
(4, 4, 3),
(5, 5, 15);
Query Demo: Calculate the Total Quantity of Products Sold
SELECT SUM(quantity) AS total_quantity_sold
FROM sales_data;
Result
| total_quantity_sold |
|---------------------|
| 41 |
1.32 - SUM_IF
The suffix -If can be appended to the name of any aggregate function. In this case, the aggregate function accepts an extra argument – a condition.
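Conceptually, the extra condition filters rows before the aggregate sees them. The sketch below models that in Python; `sum_if` and the `orders` rows are hypothetical and only mirror the sample data used in the SQL example that follows.

```python
def sum_if(rows, value, cond):
    """Sum value(row) over the rows for which cond(row) is true."""
    return sum(value(r) for r in rows if cond(r))

# Hypothetical rows for illustration.
orders = [
    {"amount": 100.0, "status": "Completed"},
    {"amount": 50.0, "status": "Completed"},
    {"amount": 80.0, "status": "Cancelled"},
    {"amount": 120.0, "status": "Completed"},
    {"amount": 75.0, "status": "Cancelled"},
]

total = sum_if(orders, lambda r: r["amount"], lambda r: r["status"] == "Completed")
print(total)  # 270.0
```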
Analyze Syntax
func.sum_if(<column>, <cond>)
Analyze Examples
func.sum_if(table.amount, table.status=='Completed').alias('total_amount_completed')
| total_amount_completed |
|------------------------|
| 270.0 |
SQL Syntax
SUM_IF(<column>, <cond>)
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE order_data (
id INT,
customer_id INT,
amount FLOAT,
status VARCHAR
);
INSERT INTO order_data (id, customer_id, amount, status)
VALUES (1, 1, 100, 'Completed'),
(2, 2, 50, 'Completed'),
(3, 3, 80, 'Cancelled'),
(4, 4, 120, 'Completed'),
(5, 5, 75, 'Cancelled');
Query Demo: Calculate the Total Amount of Completed Orders
SELECT SUM_IF(amount, status = 'Completed') AS total_amount_completed
FROM order_data;
Result
| total_amount_completed |
|------------------------|
| 270.0 |
1.33 - WINDOW_FUNNEL
Funnel Analysis
Similar to windowFunnel in ClickHouse (they were created by the same author), it searches for event chains in a sliding time window and calculates the maximum number of events from the chain.
The function works according to the following algorithm:
- The function searches for data that triggers the first condition in the chain and sets the event counter to 1. This is the moment when the sliding window starts.
- If events from the chain occur sequentially within the window, the counter is incremented. If the sequence of events is disrupted, the counter isn't incremented.
- If the data has multiple event chains at varying completion points, the function only outputs the size of the longest chain.
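The steps above can be sketched in Python as follows. This is a simplified model of the algorithm for a single group of events, not the engine's implementation; the `window_funnel` helper, its argument layout, and the second-based timestamps are assumptions made for illustration.

```python
def window_funnel(window, events, conditions):
    """Return how many leading conditions were reached within `window`.

    events: iterable of (timestamp, payload) pairs; timestamps are numbers
            in the same unit as `window`.
    conditions: list of predicates over the payload, in chain order.
    """
    # chain_start[i] is the timestamp of the first-step event of the chain
    # that most recently reached step i (None if step i was never reached).
    chain_start = [None] * len(conditions)
    for ts, payload in sorted(events, key=lambda e: e[0]):
        # Walk steps from last to first so a single event advances a chain
        # by at most one step.
        for i in reversed(range(len(conditions))):
            if not conditions[i](payload):
                continue
            if i == 0:
                chain_start[0] = ts  # a new sliding window starts here
            elif chain_start[i - 1] is not None and ts <= chain_start[i - 1] + window:
                chain_start[i] = chain_start[i - 1]
    reached = [i + 1 for i, start in enumerate(chain_start) if start is not None]
    return max(reached, default=0)

steps = ("login", "visit", "cart", "purchase")
conditions = [lambda name, step=step: name == step for step in steps]

# Hypothetical events: (seconds since some origin, event name).
full_chain = [(60, "login"), (120, "visit"), (240, "cart"), (600, "purchase")]
too_slow = [(0, "login"), (4000, "visit")]

print(window_funnel(3600, full_chain, conditions))  # 4
print(window_funnel(3600, too_slow, conditions))    # 1
```

In the second call the visit happens more than 3600 seconds after the login, so the counter stays at 1.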
SQL Syntax
WINDOW_FUNNEL( <window> )( <timestamp>, <cond1>, <cond2>, ..., <condN> )
Arguments
Arguments | Description |
---|
<timestamp> | Name of the column containing the timestamp. Supported data types: integer types and datetime types. |
<cond> | Conditions or data describing the chain of events. Must be of Boolean data type. |
Parameters
Parameter | Description |
---|
<window> | Length of the sliding window: the time interval between the first and the last condition. The unit of window depends on the timestamp itself and varies. It is determined using the expression timestamp of cond1 <= timestamp of cond2 <= ... <= timestamp of condN <= timestamp of cond1 + window. |
Returned value
The maximum number of consecutive triggered conditions from the chain within the sliding time window.
All the chains in the selection are analyzed.
Type: UInt8.
Example
Determine how far users get through a purchase funnel in the online store within a set period of time.
Set the following chain of events:
- The user logs into their account on the store (event_name = 'login').
- The user lands on the page (event_name = 'visit').
- The user adds an item to the shopping cart (event_name = 'cart').
- The user completes the purchase (event_name = 'purchase').
CREATE TABLE events(user_id BIGINT, event_name VARCHAR, event_timestamp TIMESTAMP);
INSERT INTO events VALUES(100123, 'login', '2022-05-14 10:01:00');
INSERT INTO events VALUES(100123, 'visit', '2022-05-14 10:02:00');
INSERT INTO events VALUES(100123, 'cart', '2022-05-14 10:04:00');
INSERT INTO events VALUES(100123, 'purchase', '2022-05-14 10:10:00');
INSERT INTO events VALUES(100125, 'login', '2022-05-15 11:00:00');
INSERT INTO events VALUES(100125, 'visit', '2022-05-15 11:01:00');
INSERT INTO events VALUES(100125, 'cart', '2022-05-15 11:02:00');
INSERT INTO events VALUES(100126, 'login', '2022-05-15 12:00:00');
INSERT INTO events VALUES(100126, 'visit', '2022-05-15 12:01:00');
Input table:
+---------+------------+----------------------------+
| user_id | event_name | event_timestamp |
+---------+------------+----------------------------+
| 100123 | login | 2022-05-14 10:01:00.000000 |
| 100123 | visit | 2022-05-14 10:02:00.000000 |
| 100123 | cart | 2022-05-14 10:04:00.000000 |
| 100123 | purchase | 2022-05-14 10:10:00.000000 |
| 100125 | login | 2022-05-15 11:00:00.000000 |
| 100125 | visit | 2022-05-15 11:01:00.000000 |
| 100125 | cart | 2022-05-15 11:02:00.000000 |
| 100126 | login | 2022-05-15 12:00:00.000000 |
| 100126 | visit | 2022-05-15 12:01:00.000000 |
+---------+------------+----------------------------+
Find out how far each user (user_id) gets through the chain within a one-hour sliding window.
Query:
SELECT
level,
count() AS count
FROM
(
SELECT
user_id,
window_funnel(3600000000)(event_timestamp, event_name = 'login', event_name = 'visit', event_name = 'cart', event_name = 'purchase') AS level
FROM events
GROUP BY user_id
)
GROUP BY level ORDER BY level ASC;
Note: The event_timestamp column is of type TIMESTAMP, which has microsecond precision, so 3600000000 represents a one-hour time window.
Result:
+-------+-------+
| level | count |
+-------+-------+
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
+-------+-------+
- User 100126 reached level 2 (login -> visit).
- User 100125 reached level 3 (login -> visit -> cart).
- User 100123 reached level 4 (login -> visit -> cart -> purchase).
2 - AI Functions
Using SQL-based AI Functions for Knowledge Base Search and Text Completion
This document demonstrates how to leverage PlaidCloud Lakehouse's built-in AI functions for creating document embeddings, searching for similar documents, and generating text completions based on context.
2.1 - AI_EMBEDDING_VECTOR
Creating embeddings using the ai_embedding_vector function in PlaidCloud Lakehouse
This document provides an overview of the ai_embedding_vector function in PlaidCloud Lakehouse and demonstrates how to create document embeddings using this function.
The main code implementation can be found here.
By default, PlaidCloud Lakehouse leverages the text-embedding-ada model for generating embeddings.
Note: Starting from PlaidCloud Lakehouse v1.1.47, PlaidCloud Lakehouse supports the Azure OpenAI service.
This integration offers improved data privacy.
To use Azure OpenAI, add the following configurations to the [query] section:
# Azure OpenAI
openai_api_chat_base_url = "https://<name>.openai.azure.com/openai/deployments/<name>/"
openai_api_embedding_base_url = "https://<name>.openai.azure.com/openai/deployments/<name>/"
openai_api_version = "2023-03-15-preview"
Caution: PlaidCloud Lakehouse relies on (Azure) OpenAI for AI_EMBEDDING_VECTOR and sends the embedding column data to (Azure) OpenAI.
The function only works when the PlaidCloud Lakehouse configuration includes the openai_api_key; otherwise it is inactive.
This function is available by default on PlaidCloud Lakehouse using an Azure OpenAI key. If you use it, you acknowledge that your data will be sent to Azure OpenAI by us.
Overview of ai_embedding_vector
The ai_embedding_vector function in PlaidCloud Lakehouse is a built-in function that generates vector embeddings for text data. It is useful for natural language processing tasks, such as document similarity, clustering, and recommendation systems.
The function takes a text input and returns a high-dimensional vector that represents the input text's semantic meaning and context. The embeddings are created using pre-trained models on large text corpora, capturing the relationships between words and phrases in a continuous space.
Creating embeddings using ai_embedding_vector
To create embeddings for a text document using the ai_embedding_vector function, follow the example below.
- Create a table to store the documents:
CREATE TABLE documents (
id INT,
title VARCHAR,
content VARCHAR,
embedding ARRAY(FLOAT32)
);
- Insert example documents into the table:
INSERT INTO documents(id, title, content)
VALUES
(1, 'A Brief History of AI', 'Artificial intelligence (AI) has been a fascinating concept of science fiction for decades...'),
(2, 'Machine Learning vs. Deep Learning', 'Machine learning and deep learning are two subsets of artificial intelligence...'),
(3, 'Neural Networks Explained', 'A neural network is a series of algorithms that endeavors to recognize underlying relationships...');
- Generate the embeddings:
UPDATE documents SET embedding = ai_embedding_vector(content) WHERE length(embedding) = 0;
After running the query, the embedding column in the table will contain the generated embeddings.
The embeddings are stored as an array of FLOAT32 values in the embedding column, which has the ARRAY(FLOAT32) column type.
You can now use these embeddings for various natural language processing tasks, such as finding similar documents or clustering documents based on their content.
- Inspect the embeddings:
SELECT length(embedding) FROM documents;
+-------------------+
| length(embedding) |
+-------------------+
| 1536 |
| 1536 |
| 1536 |
+-------------------+
The query above shows that the embedding generated for each document has a length of 1536 (the number of dimensions).
2.2 - AI_TEXT_COMPLETION
Generating text completions using the ai_text_completion function in PlaidCloud Lakehouse
This document provides an overview of the ai_text_completion function in PlaidCloud Lakehouse and demonstrates how to generate text completions using this function.
The main code implementation can be found here.
Note: Starting from PlaidCloud Lakehouse v1.1.47, PlaidCloud Lakehouse supports the Azure OpenAI service.
This integration offers improved data privacy.
To use Azure OpenAI, add the following configurations to the [query] section:
# Azure OpenAI
openai_api_chat_base_url = "https://<name>.openai.azure.com/openai/deployments/<name>/"
openai_api_embedding_base_url = "https://<name>.openai.azure.com/openai/deployments/<name>/"
openai_api_version = "2023-03-15-preview"
Caution: PlaidCloud Lakehouse relies on (Azure) OpenAI for AI_TEXT_COMPLETION and sends the completion prompt data to (Azure) OpenAI.
The function only works when the PlaidCloud Lakehouse configuration includes the openai_api_key; otherwise it is inactive.
This function is available by default on PlaidCloud Lakehouse using an Azure OpenAI key. If you use it, you acknowledge that your data will be sent to Azure OpenAI by us.
Overview of ai_text_completion
The ai_text_completion function in PlaidCloud Lakehouse is a built-in function that generates text completions based on a given prompt. It is useful for natural language processing tasks, such as question answering, text generation, and autocompletion systems.
The function takes a text prompt as input and returns a generated completion for the prompt. The completions are created using pre-trained models on large text corpora, capturing the relationships between words and phrases in a continuous space.
Generating text completions using ai_text_completion
Here is a simple example using the ai_text_completion function in PlaidCloud Lakehouse to generate a text completion:
SELECT ai_text_completion('What is artificial intelligence?') AS completion;
Result:
+--------------------------------------------------------------------------------------------------------------------+
| completion |
+--------------------------------------------------------------------------------------------------------------------+
| Artificial intelligence (AI) is the field of study focused on creating machines and software capable of thinking, learning, and solving problems in a way that mimics human intelligence. This includes areas such as machine learning, natural language processing, computer vision, and robotics. |
+--------------------------------------------------------------------------------------------------------------------+
In this example, we provide the prompt "What is artificial intelligence?" to the ai_text_completion function, and it returns a generated completion that briefly describes artificial intelligence.
2.3 - AI_TO_SQL
Converts natural language instructions into SQL queries with the latest model text-davinci-003.
PlaidCloud Lakehouse offers an efficient solution for constructing SQL queries by incorporating OLAP and AI. Through this function, instructions written in a natural language can be converted into SQL query statements that align with the table schema. For example, the function can be provided with a sentence like "Get all items that cost 10 dollars or less" as an input and generate the corresponding SQL query "SELECT * FROM items WHERE price <= 10" as output.
The main code implementation can be found here.
Note: The SQL query statements generated adhere to the PostgreSQL standards, so they might require manual revisions to align with the syntax of PlaidCloud Lakehouse.
Note: Starting from PlaidCloud Lakehouse v1.1.47, PlaidCloud Lakehouse supports the Azure OpenAI service.
This integration offers improved data privacy.
To use Azure OpenAI, add the following configurations to the [query] section:
# Azure OpenAI
openai_api_chat_base_url = "https://<name>.openai.azure.com/openai/deployments/<name>/"
openai_api_embedding_base_url = "https://<name>.openai.azure.com/openai/deployments/<name>/"
openai_api_version = "2023-03-15-preview"
Caution: PlaidCloud Lakehouse relies on (Azure) OpenAI for AI_TO_SQL but only sends the table schema to (Azure) OpenAI, not the data.
The function only works when the PlaidCloud Lakehouse configuration includes the openai_api_key; otherwise it is inactive.
This function is available by default on PlaidCloud Lakehouse using an Azure OpenAI key. If you use it, you acknowledge that your table schema will be sent to Azure OpenAI by us.
Analyze Syntax
func.ai_to_sql('<natural-language-instruction>')
Analyze Examples
In this example, an SQL query statement is generated from an instruction with the AI_TO_SQL function, and the resulting statement is executed to obtain the query results.
func.ai_to_sql('List the total amount spent by users from the USA who are older than 30 years, grouped by their names, along with the number of orders they made in 2022')
A SQL statement is generated by the function as the output:
*************************** 1. row ***************************
database: openai
generated_sql: SELECT name, SUM(price) AS total_spent, COUNT(order_id) AS total_orders
FROM users
JOIN orders ON users.id = orders.user_id
WHERE country = 'USA' AND age > 30 AND order_date BETWEEN '2022-01-01' AND '2022-12-31'
GROUP BY name;
SQL Syntax
USE <your-database>;
SELECT * FROM ai_to_sql('<natural-language-instruction>');
Note: Obtain and configure an OpenAI API key:
[query]
... ...
openai_api_key = "<your-key>"
SQL Examples
In this example, an SQL query statement is generated from an instruction with the AI_TO_SQL function, and the resulting statement is executed to obtain the query results.
- Prepare data.
CREATE DATABASE IF NOT EXISTS openai;
USE openai;
CREATE TABLE users(
id INT,
name VARCHAR,
age INT,
country VARCHAR
);
CREATE TABLE orders(
order_id INT,
user_id INT,
product_name VARCHAR,
price DECIMAL(10,2),
order_date DATE
);
-- Insert sample data into the users table
INSERT INTO users VALUES (1, 'Alice', 31, 'USA'),
(2, 'Bob', 32, 'USA'),
(3, 'Charlie', 45, 'USA'),
(4, 'Diana', 29, 'USA'),
(5, 'Eva', 35, 'Canada');
-- Insert sample data into the orders table
INSERT INTO orders VALUES (1, 1, 'iPhone', 1000.00, '2022-03-05'),
(2, 1, 'OpenAI Plus', 20.00, '2022-03-06'),
(3, 2, 'OpenAI Plus', 20.00, '2022-03-07'),
(4, 2, 'MacBook Pro', 2000.00, '2022-03-10'),
(5, 3, 'iPad', 500.00, '2022-03-12'),
(6, 3, 'AirPods', 200.00, '2022-03-14');
- Run the AI_TO_SQL function with an instruction written in English as the input.
SELECT * FROM ai_to_sql(
'List the total amount spent by users from the USA who are older than 30 years, grouped by their names, along with the number of orders they made in 2022');
A SQL statement is generated by the function as the output:
*************************** 1. row ***************************
database: openai
generated_sql: SELECT name, SUM(price) AS total_spent, COUNT(order_id) AS total_orders
FROM users
JOIN orders ON users.id = orders.user_id
WHERE country = 'USA' AND age > 30 AND order_date BETWEEN '2022-01-01' AND '2022-12-31'
GROUP BY name;
- Run the generated SQL statement to get the query results.
+---------+-------------+--------------+
| name | total_spent | total_orders |
+---------+-------------+--------------+
| Bob | 2020.00 | 2 |
| Alice | 1020.00 | 2 |
| Charlie | 700.00 | 2 |
+---------+-------------+--------------+
2.4 - COSINE_DISTANCE
Measuring similarity using the cosine_distance function in PlaidCloud Lakehouse
This document provides an overview of the cosine_distance function in PlaidCloud Lakehouse and demonstrates how to measure document similarity using this function.
Note: The cosine_distance function performs vector computations within PlaidCloud Lakehouse and does not rely on the (Azure) OpenAI API.
The cosine_distance function in PlaidCloud Lakehouse is a built-in function that calculates the cosine distance between two vectors. It is commonly used in natural language processing tasks, such as document similarity and recommendation systems.
Cosine distance measures how dissimilar two vectors are, based on the cosine of the angle between them. The function takes two input vectors and returns a value where 0 indicates vectors pointing in the same direction (most similar) and 1 indicates orthogonal (unrelated) vectors. Text embeddings typically yield values between 0 and 1; vectors pointing in opposite directions would yield values above 1.
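For intuition, cosine distance is commonly defined as 1 minus the cosine similarity. The Python sketch below uses that textbook definition; the `cosine_distance` helper here is a stand-alone illustration and is assumed, not confirmed, to match the built-in function's arithmetic.

```python
import math

def cosine_distance(a, b):
    """1 - cos(angle between a and b), using the textbook definition."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [2.0, 0.0]))   # 0.0 (same direction)
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))   # 1.0 (orthogonal)
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # 2.0 (opposite direction)
```

Only direction matters: scaling either vector leaves the result unchanged, which is why the first call returns 0.0 even though the vectors differ in length.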
Analyze Syntax
func.cosine_distance(<vector1>, <vector2>)
SQL Examples
Creating a Table and Inserting Sample Data
Let's create a table to store some sample text documents and their corresponding embeddings:
CREATE TABLE articles (
id INT,
title VARCHAR,
content VARCHAR,
embedding ARRAY(FLOAT32)
);
Now, let's insert some sample documents into the table:
INSERT INTO articles (id, title, content, embedding)
VALUES
(1, 'Python for Data Science', 'Python is a versatile programming language widely used in data science...', ai_embedding_vector('Python is a versatile programming language widely used in data science...')),
(2, 'Introduction to R', 'R is a popular programming language for statistical computing and graphics...', ai_embedding_vector('R is a popular programming language for statistical computing and graphics...')),
(3, 'Getting Started with SQL', 'Structured Query Language (SQL) is a domain-specific language used for managing relational databases...', ai_embedding_vector('Structured Query Language (SQL) is a domain-specific language used for managing relational databases...'));
Querying for Similar Documents
Now, let's find the documents that are most similar to a given query using the cosine_distance function:
SELECT
id,
title,
content,
cosine_distance(embedding, ai_embedding_vector('How to use Python in data analysis?')) AS similarity
FROM
articles
ORDER BY
similarity ASC
LIMIT 3;
Result:
+------+--------------------------+---------------------------------------------------------------------------------------------------------+------------+
| id | title | content | similarity |
+------+--------------------------+---------------------------------------------------------------------------------------------------------+------------+
| 1 | Python for Data Science | Python is a versatile programming language widely used in data science... | 0.1142081 |
| 2 | Introduction to R | R is a popular programming language for statistical computing and graphics... | 0.18741018 |
| 3 | Getting Started with SQL | Structured Query Language (SQL) is a domain-specific language used for managing relational databases... | 0.25137568 |
+------+--------------------------+---------------------------------------------------------------------------------------------------------+------------+
3 - Array Functions
This section provides reference information for the array functions in PlaidCloud Lakehouse.
3.1 - ARRAY_AGGREGATE
Aggregates elements in the array with an aggregate function.
Analyze Syntax
func.array_aggregate( <array>, '<agg_func>' )
Supported aggregate functions include avg, count, max, min, sum, any, stddev_samp, stddev_pop, stddev, std, median, approx_count_distinct, kurtosis, and skewness.
The syntax can be rewritten as func.array_<agg_func>( <array> ). For example, func.array_avg( <array> ).
Analyze Examples
func.array_aggregate([1, 2, 3, 4], 'sum'), func.array_sum([1, 2, 3, 4])
┌──────────────────────────────────────────────────────────────────────────┐
│ func.array_aggregate([1, 2, 3, 4], 'sum') │ func.array_sum([1, 2, 3, 4])│
├────────────────────────────────────────────┼─────────────────────────────┤
│ 10 │ 10 │
└──────────────────────────────────────────────────────────────────────────┘
SQL Syntax
ARRAY_AGGREGATE( <array>, '<agg_func>' )
Supported aggregate functions include avg, count, max, min, sum, any, stddev_samp, stddev_pop, stddev, std, median, approx_count_distinct, kurtosis, and skewness.
The syntax can be rewritten as ARRAY_<agg_func>( <array> ). For example, ARRAY_AVG( <array> ).
SQL Examples
SELECT ARRAY_AGGREGATE([1, 2, 3, 4], 'SUM'), ARRAY_SUM([1, 2, 3, 4]);
┌────────────────────────────────────────────────────────────────┐
│ array_aggregate([1, 2, 3, 4], 'sum') │ array_sum([1, 2, 3, 4]) │
├──────────────────────────────────────┼─────────────────────────┤
│ 10 │ 10 │
└────────────────────────────────────────────────────────────────┘
3.2 - ARRAY_APPEND
Appends an element to the end of the array.
Analyze Syntax
func.array_append( <array>, <element>)
Analyze Examples
func.array_append([3, 4], 5)
┌──────────────────────────────┐
│ func.array_append([3, 4], 5) │
├──────────────────────────────┤
│ [3,4,5] │
└──────────────────────────────┘
SQL Syntax
ARRAY_APPEND( <array>, <element>)
SQL Examples
SELECT ARRAY_APPEND([3, 4], 5);
┌─────────────────────────┐
│ array_append([3, 4], 5) │
├─────────────────────────┤
│ [3,4,5] │
└─────────────────────────┘
3.3 - ARRAY_APPLY
Alias for ARRAY_TRANSFORM.
3.4 - ARRAY_CONCAT
Concatenates two arrays.
Analyze Syntax
func.array_concat( <array1>, <array2> )
Analyze Examples
func.array_concat([1, 2], [3, 4])
┌────────────────────────────────────┐
│ func.array_concat([1, 2], [3, 4]) │
├────────────────────────────────────┤
│ [1,2,3,4] │
└────────────────────────────────────┘
SQL Syntax
ARRAY_CONCAT( <array1>, <array2> )
SQL Examples
SELECT ARRAY_CONCAT([1, 2], [3, 4]);
┌──────────────────────────────┐
│ array_concat([1, 2], [3, 4]) │
├──────────────────────────────┤
│ [1,2,3,4] │
└──────────────────────────────┘
3.5 - ARRAY_CONTAINS
Alias for CONTAINS.
3.6 - ARRAY_DISTINCT
Removes all duplicates and NULLs from the array without preserving the original order.
Analyze Syntax
func.array_distinct( <array> )
Analyze Examples
func.array_distinct([1, 2, 2, 4, 3])
┌───────────────────────────────────────┐
│ func.array_distinct([1, 2, 2, 4, 3]) │
├───────────────────────────────────────┤
│ [1,2,4,3] │
└───────────────────────────────────────┘
SQL Syntax
ARRAY_DISTINCT( <array> )
SQL Examples
SELECT ARRAY_DISTINCT([1, 2, 2, 4, 3]);
┌─────────────────────────────────┐
│ array_distinct([1, 2, 2, 4, 3]) │
├─────────────────────────────────┤
│ [1,2,4,3] │
└─────────────────────────────────┘
3.7 - ARRAY_FILTER
Constructs an array from those elements of the input array for which the lambda function returns true.
Analyze Syntax
func.array_filter( <array>, <lambda> )
Analyze Examples
func.array_filter([1, 2, 3], x -> (x > 1))
┌─────────────────────────────────────────────┐
│ func.array_filter([1, 2, 3], x -> (x > 1)) │
├─────────────────────────────────────────────┤
│ [2,3] │
└─────────────────────────────────────────────┘
SQL Syntax
ARRAY_FILTER( <array>, <lambda> )
SQL Examples
SELECT ARRAY_FILTER([1, 2, 3], x -> x > 1);
┌───────────────────────────────────────┐
│ array_filter([1, 2, 3], x -> (x > 1)) │
├───────────────────────────────────────┤
│ [2,3] │
└───────────────────────────────────────┘
3.8 - ARRAY_FLATTEN
Flattens nested arrays, converting them into a single-level array.
Analyze Syntax
func.array_flatten( <array> )
Analyze Examples
func.array_flatten([[1, 2], [3, 4, 5]])
┌──────────────────────────────────────────┐
│ func.array_flatten([[1, 2], [3, 4, 5]]) │
├──────────────────────────────────────────┤
│ [1,2,3,4,5] │
└──────────────────────────────────────────┘
SQL Syntax
ARRAY_FLATTEN( <array> )
SQL Examples
SELECT ARRAY_FLATTEN([[1,2], [3,4,5]]);
┌────────────────────────────────────┐
│ array_flatten([[1, 2], [3, 4, 5]]) │
├────────────────────────────────────┤
│ [1,2,3,4,5] │
└────────────────────────────────────┘
3.9 - ARRAY_GET
Alias for GET.
3.10 - ARRAY_INDEXOF
Returns the index (1-based) of an element if the array contains the element.
Analyze Syntax
func.array_indexof( <array>, <element> )
Analyze Examples
func.array_indexof([1, 2, 9], 9)
┌───────────────────────────────────┐
│ func.array_indexof([1, 2, 9], 9) │
├───────────────────────────────────┤
│ 3 │
└───────────────────────────────────┘
SQL Syntax
ARRAY_INDEXOF( <array>, <element> )
SQL Examples
SELECT ARRAY_INDEXOF([1, 2, 9], 9);
┌─────────────────────────────┐
│ array_indexof([1, 2, 9], 9) │
├─────────────────────────────┤
│ 3 │
└─────────────────────────────┘
3.11 - ARRAY_LENGTH
Returns the length of an array.
Analyze Syntax
func.array_length( <array> )
Analyze Examples
func.array_length([1, 2])
┌────────────────────────────┐
│ func.array_length([1, 2]) │
├────────────────────────────┤
│ 2 │
└────────────────────────────┘
SQL Syntax
ARRAY_LENGTH( <array> )
SQL Examples
SELECT ARRAY_LENGTH([1, 2]);
┌──────────────────────┐
│ array_length([1, 2]) │
├──────────────────────┤
│ 2 │
└──────────────────────┘
3.12 - ARRAY_PREPEND
Prepends an element to the array.
Analyze Syntax
func.array_prepend( <element>, <array> )
Analyze Examples
func.array_prepend(1, [3, 4])
┌────────────────────────────────┐
│ func.array_prepend(1, [3, 4]) │
├────────────────────────────────┤
│ [1,3,4] │
└────────────────────────────────┘
SQL Syntax
ARRAY_PREPEND( <element>, <array> )
SQL Examples
SELECT ARRAY_PREPEND(1, [3, 4]);
┌──────────────────────────┐
│ array_prepend(1, [3, 4]) │
├──────────────────────────┤
│ [1,3,4] │
└──────────────────────────┘
3.13 - ARRAY_REDUCE
Iteratively applies the lambda function to the elements of the array, reducing the array to a single value.
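The same left-to-right folding exists in Python as `functools.reduce`, which makes a convenient mental model. The sketch below is only an analogy; `itertools.accumulate` is used to expose the intermediate values.

```python
from functools import reduce
from itertools import accumulate

values = [1, 2, 3, 4]

# Fold the list from the left: ((1 + 2) + 3) + 4
total = reduce(lambda x, y: x + y, values)
print(total)  # 10

# The running value after each step of the reduction.
steps = list(accumulate(values, lambda x, y: x + y))
print(steps)  # [1, 3, 6, 10]
```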
Analyze Syntax
func.array_reduce( <array>, <lambda> )
Analyze Examples
func.array_reduce([1, 2, 3, 4], (x, y) -> (x + y))
┌─────────────────────────────────────────────────────┐
│ func.array_reduce([1, 2, 3, 4], (x, y) -> (x + y)) │
├─────────────────────────────────────────────────────┤
│ 10 │
└─────────────────────────────────────────────────────┘
SQL Syntax
ARRAY_REDUCE( <array>, <lambda> )
SQL Examples
SELECT ARRAY_REDUCE([1, 2, 3, 4], (x,y) -> x + y);
┌───────────────────────────────────────────────┐
│ array_reduce([1, 2, 3, 4], (x, y) -> (x + y)) │
├───────────────────────────────────────────────┤
│ 10 │
└───────────────────────────────────────────────┘
3.14 - ARRAY_REMOVE_FIRST
Removes the first element from the array.
Analyze Syntax
func.array_remove_first( <array> )
Analyze Examples
func.array_remove_first([1, 2, 3])
┌─────────────────────────────────────┐
│ func.array_remove_first([1, 2, 3]) │
├─────────────────────────────────────┤
│ [2,3] │
└─────────────────────────────────────┘
SQL Syntax
ARRAY_REMOVE_FIRST( <array> )
SQL Examples
SELECT ARRAY_REMOVE_FIRST([1, 2, 3]);
┌───────────────────────────────┐
│ array_remove_first([1, 2, 3]) │
├───────────────────────────────┤
│ [2,3] │
└───────────────────────────────┘
3.15 - ARRAY_REMOVE_LAST
Removes the last element from the array.
Analyze Syntax
func.array_remove_last( <array> )
Analyze Examples
func.array_remove_last([1, 2, 3])
┌────────────────────────────────────┐
│ func.array_remove_last([1, 2, 3]) │
├────────────────────────────────────┤
│ [1,2] │
└────────────────────────────────────┘
SQL Syntax
ARRAY_REMOVE_LAST( <array> )
SQL Examples
SELECT ARRAY_REMOVE_LAST([1, 2, 3]);
┌──────────────────────────────┐
│ array_remove_last([1, 2, 3]) │
├──────────────────────────────┤
│ [1,2] │
└──────────────────────────────┘
3.16 - ARRAY_SLICE
Alias for SLICE.
3.17 - ARRAY_SORT
Sorts elements in the array, in ascending order by default.
Analyze Syntax
func.array_sort( <array>[, <order>, <nullposition>] )
Parameter | Default | Description |
---|
order | ASC | Specifies the sorting order as either ascending (ASC) or descending (DESC). |
nullposition | NULLS FIRST | Determines the position of NULL values in the sorting result, at the beginning (NULLS FIRST) or at the end (NULLS LAST) of the sorting output. |
Analyze Examples
func.array_sort([1, 4, 3, 2])
┌────────────────────────────────┐
│ func.array_sort([1, 4, 3, 2]) │
├────────────────────────────────┤
│ [1,2,3,4] │
└────────────────────────────────┘
SQL Syntax
ARRAY_SORT( <array>[, <order>, <nullposition>] )
Parameter | Default | Description |
---|
order | ASC | Specifies the sorting order as either ascending (ASC) or descending (DESC). |
nullposition | NULLS FIRST | Determines the position of NULL values in the sorting result, at the beginning (NULLS FIRST) or at the end (NULLS LAST) of the sorting output. |
SQL Examples
SELECT ARRAY_SORT([1, 4, 3, 2]);
┌──────────────────────────┐
│ array_sort([1, 4, 3, 2]) │
├──────────────────────────┤
│ [1,2,3,4] │
└──────────────────────────┘
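The effect of the two optional parameters can be modeled in plain Python. This is only a sketch of the behavior described in the parameter table: the function name array_sort_sketch is hypothetical, NULL is represented by None, and passing the options as strings is an assumption made for illustration.

```python
def array_sort_sketch(values, order="ASC", nullposition="NULLS FIRST"):
    """Sort the non-NULL values, then place NULLs (None) at the requested end."""
    nulls = [v for v in values if v is None]
    rest = sorted((v for v in values if v is not None), reverse=(order == "DESC"))
    return nulls + rest if nullposition == "NULLS FIRST" else rest + nulls


print(array_sort_sketch([1, 4, 3, 2]))                           # [1, 2, 3, 4]
print(array_sort_sketch([1, None, 3, 2], "DESC", "NULLS LAST"))  # [3, 2, 1, None]
```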
3.18 - ARRAY_TO_STRING
Concatenates elements of an array into a single string, using a specified separator.
Analyze Syntax
func.array_to_string( <array>, '<separator>' )
Analyze Examples
func.array_to_string(['Apple', 'Banana', 'Cherry'], ', ')
┌────────────────────────────────────────────────────────────┐
│ func.array_to_string(['Apple', 'Banana', 'Cherry'], ', ') │
├────────────────────────────────────────────────────────────┤
│ Apple, Banana, Cherry │
└────────────────────────────────────────────────────────────┘
SQL Syntax
ARRAY_TO_STRING( <array>, '<separator>' )
SQL Examples
SELECT ARRAY_TO_STRING(['Apple', 'Banana', 'Cherry'], ', ');
┌──────────────────────────────────────────────────────┐
│ array_to_string(['apple', 'banana', 'cherry'], ', ') │
├──────────────────────────────────────────────────────┤
│ Apple, Banana, Cherry │
└──────────────────────────────────────────────────────┘
3.19 - ARRAY_TRANSFORM
Returns an array that is the result of applying the lambda function to each element of the input array.
Analyze Syntax
func.array_transform( <array>, <lambda> )
Analyze Examples
func.array_transform([1, 2, 3], x -> (x + 1))
┌───────────────────────────────────────────────┐
│ func.array_transform([1, 2, 3], x -> (x + 1)) │
├───────────────────────────────────────────────┤
│ [2,3,4] │
└───────────────────────────────────────────────┘
SQL Syntax
ARRAY_TRANSFORM( <array>, <lambda> )
Aliases
SQL Examples
SELECT ARRAY_TRANSFORM([1, 2, 3], x -> x + 1);
┌──────────────────────────────────────────┐
│ array_transform([1, 2, 3], x -> (x + 1)) │
├──────────────────────────────────────────┤
│ [2,3,4] │
└──────────────────────────────────────────┘
3.20 - ARRAY_UNIQUE
Counts unique elements in the array (except NULL).
Analyze Syntax
func.array_unique( <array> )
Analyze Examples
func.array_unique([1, 2, 3, 3, 4])
┌─────────────────────────────────────┐
│ func.array_unique([1, 2, 3, 3, 4]) │
├─────────────────────────────────────┤
│ 4 │
└─────────────────────────────────────┘
SQL Syntax
ARRAY_UNIQUE( <array> )
SQL Examples
SELECT ARRAY_UNIQUE([1, 2, 3, 3, 4]);
┌───────────────────────────────┐
│ array_unique([1, 2, 3, 3, 4]) │
├───────────────────────────────┤
│ 4 │
└───────────────────────────────┘
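Because ARRAY_UNIQUE returns a count rather than a de-duplicated array, a small Python model may help. The helper array_unique_sketch is hypothetical and treats None as NULL.

```python
def array_unique_sketch(values):
    """Count distinct elements, ignoring NULLs (None in Python)."""
    return len({v for v in values if v is not None})


print(array_unique_sketch([1, 2, 3, 3, 4]))     # 4
print(array_unique_sketch([1, None, 1, None]))  # 1
```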
3.21 - CONTAINS
Checks if the array contains a specific element.
Analyze Syntax
func.contains( <array>, <element> )
Analyze Examples
func.contains([1, 2], 1)
┌───────────────────────────┐
│ func.contains([1, 2], 1) │
├───────────────────────────┤
│ true │
└───────────────────────────┘
SQL Syntax
CONTAINS( <array>, <element> )
Aliases
- ARRAY_CONTAINS
SQL Examples
SELECT ARRAY_CONTAINS([1, 2], 1), CONTAINS([1, 2], 1);
┌─────────────────────────────────────────────────┐
│ array_contains([1, 2], 1) │ contains([1, 2], 1) │
├───────────────────────────┼─────────────────────┤
│ true │ true │
└─────────────────────────────────────────────────┘
3.22 - GET
Returns an element from an array by index (1-based).
Analyze Syntax
func.get( <array>, <index> )
Analyze Examples
func.get([1, 2], 2)
┌─────────────────────┐
│ func.get([1, 2], 2) │
├─────────────────────┤
│ 2 │
└─────────────────────┘
SQL Syntax
GET( <array>, <index> )
Aliases
- ARRAY_GET
SQL Examples
SELECT GET([1, 2], 2), ARRAY_GET([1, 2], 2);
┌───────────────────────────────────────┐
│ get([1, 2], 2) │ array_get([1, 2], 2) │
├────────────────┼──────────────────────┤
│ 2 │ 2 │
└───────────────────────────────────────┘
3.23 - RANGE
Returns an array of the integers in the half-open range [start, end): start is included and end is excluded.
Analyze Syntax
func.range( <start>, <end> )
Analyze Examples
func.range(1, 5)
┌────────────────────┐
│ func.range(1, 5) │
├────────────────────┤
│ [1,2,3,4] │
└────────────────────┘
SQL Syntax
RANGE( <start>, <end> )
SQL Examples
SELECT RANGE(1, 5);
┌───────────────┐
│ range(1, 5) │
├───────────────┤
│ [1,2,3,4] │
└───────────────┘
3.24 - SLICE
Extracts a slice from the array by index (1-based).
Analyze Syntax
func.slice( <array>, <start>[, <end>] )
Analyze Examples
func.slice([1, 21, 32, 4], 2, 3)
┌──────────────────────────────────┐
│ func.slice([1, 21, 32, 4], 2, 3) │
├──────────────────────────────────┤
│ [21,32] │
└──────────────────────────────────┘
SQL Syntax
SLICE( <array>, <start>[, <end>] )
Aliases
- ARRAY_SLICE
SQL Examples
SELECT ARRAY_SLICE([1, 21, 32, 4], 2, 3), SLICE([1, 21, 32, 4], 2, 3);
┌─────────────────────────────────────────────────────────────────┐
│ array_slice([1, 21, 32, 4], 2, 3) │ slice([1, 21, 32, 4], 2, 3) │
├───────────────────────────────────┼─────────────────────────────┤
│ [21,32] │ [21,32] │
└─────────────────────────────────────────────────────────────────┘
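The example above suggests that both indexes are 1-based and that the end index is inclusive. A plain-Python sketch of that reading follows; slice_sketch is a hypothetical helper, and returning everything from start onward when end is omitted is an assumption.

```python
def slice_sketch(values, start, end=None):
    """1-based slice with an inclusive end index."""
    if end is None:
        # Assumption: an omitted end means "through the last element".
        return values[start - 1:]
    return values[start - 1:end]


print(slice_sketch([1, 21, 32, 4], 2, 3))  # [21, 32]
print(slice_sketch([1, 21, 32, 4], 3))     # [32, 4]
```

Python slices are 0-based with an exclusive end, which is why the sketch subtracts 1 from start and leaves end unchanged.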
3.25 - UNNEST
Unnests the array and returns the set of elements.
Analyze Syntax
func.unnest( <array> )
Analyze Examples
func.unnest([1, 2])
┌──────────────────────┐
│ func.unnest([1, 2]) │
├──────────────────────┤
│ 1 │
│ 2 │
└──────────────────────┘
SQL Syntax
UNNEST( <array> )
SQL Examples
SELECT UNNEST([1, 2]);
┌─────────────────┐
│ unnest([1, 2]) │
├─────────────────┤
│ 1 │
│ 2 │
└─────────────────┘
-- UNNEST(array) can be used as a table function.
SELECT * FROM UNNEST([1, 2]);
┌─────────────────┐
│ value │
├─────────────────┤
│ 1 │
│ 2 │
└─────────────────┘
A Practical Example
The examples below use the following table, contacts, whose phones column is defined as an array of text.
CREATE TABLE contacts (
id SERIAL PRIMARY KEY,
name VARCHAR (100),
phones TEXT []
);
The phones column is a one-dimensional array that holds various phone numbers that a contact may have.
To define a multidimensional array, add one pair of square brackets per dimension.
For example, you can define a two-dimensional array as follows:
column_name data_type [][]
An example of inserting data into that table:
INSERT INTO contacts (name, phones)
VALUES('John Doe',ARRAY [ '(408)-589-5846','(408)-589-5555' ]);
or
INSERT INTO contacts (name, phones)
VALUES('Lily Bush','{"(408)-589-5841"}'),
('William Gate','{"(408)-589-5842","(408)-589-5843"}');
The unnest() function expands an array to a list of rows. For example, the following query expands all phone numbers of the phones array.
SELECT
name,
unnest(phones)
FROM
contacts;
Output:
name | unnest |
---|
John Doe | (408)-589-5846 |
John Doe | (408)-589-5555 |
Lily Bush | (408)-589-5841 |
William Gate | (408)-589-5842 |
William Gate | (408)-589-5843 |
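The row expansion that unnest() performs can be mimicked in plain Python with a nested comprehension. This is an illustrative sketch only; the contacts list below simply restates the rows inserted above.

```python
contacts = [
    ("John Doe", ["(408)-589-5846", "(408)-589-5555"]),
    ("Lily Bush", ["(408)-589-5841"]),
    ("William Gate", ["(408)-589-5842", "(408)-589-5843"]),
]

# One output row per (name, phone) pair, like SELECT name, unnest(phones).
rows = [(name, phone) for name, phones in contacts for phone in phones]

for name, phone in rows:
    print(f"{name} | {phone}")
```

Each contact contributes as many rows as it has phone numbers, so three contacts with five numbers in total yield five rows.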
4 - Bitmap Functions
This section provides reference information for the bitmap functions in PlaidCloud Lakehouse.
4.1 - BITMAP_AND
Performs a bitwise AND operation on the two bitmaps.
Analyze Syntax
func.bitmap_and( <bitmap1>, <bitmap2> )
Analyze Examples
func.cast(func.bitmap_and(func.build_bitmap([1, 4, 5]), func.build_bitmap([4, 5])), String)
┌─────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.cast(func.bitmap_and(func.build_bitmap([1, 4, 5]), func.build_bitmap([4, 5])), String) │
├─────────────────────────────────────────────────────────────────────────────────────────────┤
│ 4,5 │
└─────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
BITMAP_AND( <bitmap1>, <bitmap2> )
SQL Examples
SELECT BITMAP_AND(BUILD_BITMAP([1,4,5]), BUILD_BITMAP([4,5]))::String;
┌───────────────────────────────────────────────────────────────────┐
│ bitmap_and(build_bitmap([1, 4, 5]), build_bitmap([4, 5]))::string │
├───────────────────────────────────────────────────────────────────┤
│ 4,5 │
└───────────────────────────────────────────────────────────────────┘
4.2 - BITMAP_AND_COUNT
Counts the number of bits set to 1 in the bitmap by performing a logical AND operation.
Analyze Syntax
func.bitmap_and_count( <bitmap> )
Analyze Examples
func.bitmap_and_count(func.to_bitmap('1, 3, 5'))
┌──────────────────────────────────────────────────┐
│ func.bitmap_and_count(func.to_bitmap('1, 3, 5')) │
├──────────────────────────────────────────────────┤
│ 3 │
└──────────────────────────────────────────────────┘
SQL Syntax
BITMAP_AND_COUNT( <bitmap> )
SQL Examples
SELECT BITMAP_AND_COUNT(TO_BITMAP('1, 3, 5'));
┌────────────────────────────────────────┐
│ bitmap_and_count(to_bitmap('1, 3, 5')) │
├────────────────────────────────────────┤
│ 3 │
└────────────────────────────────────────┘
4.3 - BITMAP_AND_NOT
Alias for BITMAP_NOT.
4.4 - BITMAP_CARDINALITY
Alias for BITMAP_COUNT.
4.5 - BITMAP_CONTAINS
Checks if the bitmap contains a specific value.
Analyze Syntax
func.bitmap_contains( <bitmap>, <value> )
Analyze Examples
func.bitmap_contains(func.build_bitmap([1, 4, 5]), 1)
┌───────────────────────────────────────────────────────┐
│ func.bitmap_contains(func.build_bitmap([1, 4, 5]), 1) │
├───────────────────────────────────────────────────────┤
│ true │
└───────────────────────────────────────────────────────┘
SQL Syntax
BITMAP_CONTAINS( <bitmap>, <value> )
SQL Examples
SELECT BITMAP_CONTAINS(BUILD_BITMAP([1,4,5]), 1);
┌─────────────────────────────────────────────┐
│ bitmap_contains(build_bitmap([1, 4, 5]), 1) │
├─────────────────────────────────────────────┤
│ true │
└─────────────────────────────────────────────┘
4.6 - BITMAP_COUNT
Counts the number of bits set to 1 in the bitmap.
Analyze Syntax
func.bitmap_count( <bitmap> )
Analyze Examples
func.bitmap_count(func.build_bitmap([1, 4, 5]))
┌─────────────────────────────────────────────────┐
│ func.bitmap_count(func.build_bitmap([1, 4, 5])) │
├─────────────────────────────────────────────────┤
│ 3 │
└─────────────────────────────────────────────────┘
SQL Syntax
BITMAP_COUNT( <bitmap> )
Aliases
- BITMAP_CARDINALITY
SQL Examples
SELECT BITMAP_COUNT(BUILD_BITMAP([1,4,5])), BITMAP_CARDINALITY(BUILD_BITMAP([1,4,5]));
┌─────────────────────────────────────────────────────────────────────────────────────┐
│ bitmap_count(build_bitmap([1, 4, 5])) │ bitmap_cardinality(build_bitmap([1, 4, 5])) │
├───────────────────────────────────────┼─────────────────────────────────────────────┤
│ 3 │ 3 │
└─────────────────────────────────────────────────────────────────────────────────────┘
4.7 - BITMAP_HAS_ALL
Checks if the first bitmap contains all the bits in the second bitmap.
Analyze Syntax
func.bitmap_has_all( <bitmap1>, <bitmap2> )
Analyze Examples
func.bitmap_has_all(func.build_bitmap([1, 4, 5]), func.build_bitmap([1, 2]))
┌───────────────────────────────────────────────────────────────────────────────┐
│ func.bitmap_has_all(func.build_bitmap([1, 4, 5]), func.build_bitmap([1, 2])) │
├───────────────────────────────────────────────────────────────────────────────┤
│ false │
└───────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
BITMAP_HAS_ALL( <bitmap1>, <bitmap2> )
SQL Examples
SELECT BITMAP_HAS_ALL(BUILD_BITMAP([1,4,5]), BUILD_BITMAP([1,2]));
┌───────────────────────────────────────────────────────────────┐
│ bitmap_has_all(build_bitmap([1, 4, 5]), build_bitmap([1, 2])) │
├───────────────────────────────────────────────────────────────┤
│ false │
└───────────────────────────────────────────────────────────────┘
4.8 - BITMAP_HAS_ANY
Checks if the first bitmap has any bit matching the bits in the second bitmap.
Analyze Syntax
func.bitmap_has_any( <bitmap1>, <bitmap2> )
Analyze Examples
func.bitmap_has_any(func.build_bitmap([1, 4, 5]), func.build_bitmap([1, 2]))
┌───────────────────────────────────────────────────────────────────────────────┐
│ func.bitmap_has_any(func.build_bitmap([1, 4, 5]), func.build_bitmap([1, 2])) │
├───────────────────────────────────────────────────────────────────────────────┤
│ true │
└───────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
BITMAP_HAS_ANY( <bitmap1>, <bitmap2> )
SQL Examples
SELECT BITMAP_HAS_ANY(BUILD_BITMAP([1,4,5]), BUILD_BITMAP([1,2]));
┌───────────────────────────────────────────────────────────────┐
│ bitmap_has_any(build_bitmap([1, 4, 5]), build_bitmap([1, 2])) │
├───────────────────────────────────────────────────────────────┤
│ true │
└───────────────────────────────────────────────────────────────┘
4.9 - BITMAP_INTERSECT
Returns the bitmap that results from performing a logical INTERSECT operation on the input bitmaps.
Analyze Syntax
func.bitmap_intersect( <bitmap> )
Analyze Examples
func.bitmap_intersect(func.to_bitmap('1, 3, 5'))
┌──────────────────────────────────────────────────┐
│ func.bitmap_intersect(func.to_bitmap('1, 3, 5')) │
├──────────────────────────────────────────────────┤
│ 1,3,5 │
└──────────────────────────────────────────────────┘
SQL Syntax
BITMAP_INTERSECT( <bitmap> )
SQL Examples
SELECT BITMAP_INTERSECT(TO_BITMAP('1, 3, 5'))::String;
┌────────────────────────────────────────────────┐
│ bitmap_intersect(to_bitmap('1, 3, 5'))::string │
├────────────────────────────────────────────────┤
│ 1,3,5 │
└────────────────────────────────────────────────┘
4.10 - BITMAP_MAX
Gets the maximum value in the bitmap.
Analyze Syntax
func.bitmap_max( <bitmap> )
Analyze Examples
func.bitmap_max(func.build_bitmap([1, 4, 5]))
┌───────────────────────────────────────────────┐
│ func.bitmap_max(func.build_bitmap([1, 4, 5])) │
├───────────────────────────────────────────────┤
│ 5 │
└───────────────────────────────────────────────┘
SQL Syntax
BITMAP_MAX( <bitmap> )
SQL Examples
SELECT BITMAP_MAX(BUILD_BITMAP([1,4,5]));
┌─────────────────────────────────────┐
│ bitmap_max(build_bitmap([1, 4, 5])) │
├─────────────────────────────────────┤
│ 5 │
└─────────────────────────────────────┘
4.11 - BITMAP_MIN
Gets the minimum value in the bitmap.
Analyze Syntax
func.bitmap_min( <bitmap> )
Analyze Examples
func.bitmap_min(func.build_bitmap([1, 4, 5]))
┌───────────────────────────────────────────────┐
│ func.bitmap_min(func.build_bitmap([1, 4, 5])) │
├───────────────────────────────────────────────┤
│ 1 │
└───────────────────────────────────────────────┘
SQL Syntax
BITMAP_MIN( <bitmap> )
SQL Examples
SELECT BITMAP_MIN(BUILD_BITMAP([1,4,5]));
┌─────────────────────────────────────┐
│ bitmap_min(build_bitmap([1, 4, 5])) │
├─────────────────────────────────────┤
│ 1 │
└─────────────────────────────────────┘
4.12 - BITMAP_NOT
Generates a new bitmap with elements from the first bitmap that are not in the second one.
Analyze Syntax
func.bitmap_not( <bitmap1>, <bitmap2> )
Analyze Examples
func.cast(func.bitmap_not(func.build_bitmap([1, 4, 5]), func.build_bitmap([5, 6, 7])), Text)
┌───────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.cast(func.bitmap_not(func.build_bitmap([1, 4, 5]), func.build_bitmap([5, 6, 7])), Text) │
├───────────────────────────────────────────────────────────────────────────────────────────────┤
│ 1,4 │
└───────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
BITMAP_NOT( <bitmap1>, <bitmap2> )
Aliases
- BITMAP_AND_NOT
SQL Examples
SELECT BITMAP_NOT(BUILD_BITMAP([1,4,5]), BUILD_BITMAP([5,6,7]))::String;
┌──────────────────────────────────────────────────────────────────────┐
│ bitmap_not(build_bitmap([1, 4, 5]), build_bitmap([5, 6, 7]))::string │
├──────────────────────────────────────────────────────────────────────┤
│ 1,4 │
└──────────────────────────────────────────────────────────────────────┘
SELECT BITMAP_AND_NOT(BUILD_BITMAP([1,4,5]), BUILD_BITMAP([5,6,7]))::String;
┌──────────────────────────────────────────────────────────────────────────┐
│ bitmap_and_not(build_bitmap([1, 4, 5]), build_bitmap([5, 6, 7]))::string │
├──────────────────────────────────────────────────────────────────────────┤
│ 1,4 │
└──────────────────────────────────────────────────────────────────────────┘
4.13 - BITMAP_NOT_COUNT
Counts the number of bits set to 0 in the bitmap by performing a logical NOT operation.
Analyze Syntax
func.bitmap_not_count( <bitmap> )
Analyze Examples
func.bitmap_not_count(func.to_bitmap('1, 3, 5'))
┌──────────────────────────────────────────────────┐
│ func.bitmap_not_count(func.to_bitmap('1, 3, 5')) │
├──────────────────────────────────────────────────┤
│ 3 │
└──────────────────────────────────────────────────┘
SQL Syntax
BITMAP_NOT_COUNT( <bitmap> )
SQL Examples
SELECT BITMAP_NOT_COUNT(TO_BITMAP('1, 3, 5'));
┌────────────────────────────────────────┐
│ bitmap_not_count(to_bitmap('1, 3, 5')) │
├────────────────────────────────────────┤
│ 3 │
└────────────────────────────────────────┘
4.14 - BITMAP_OR
Performs a bitwise OR operation on the two bitmaps.
Analyze Syntax
func.bitmap_or( <bitmap1>, <bitmap2> )
Analyze Examples
func.bitmap_or(func.build_bitmap([1, 4, 5]), func.build_bitmap([6, 7]))
┌─────────────────────────────────────────────────────────────────────────┐
│ func.bitmap_or(func.build_bitmap([1, 4, 5]), func.build_bitmap([6, 7])) │
├─────────────────────────────────────────────────────────────────────────┤
│ 1,4,5,6,7 │
└─────────────────────────────────────────────────────────────────────────┘
SQL Syntax
BITMAP_OR( <bitmap1>, <bitmap2> )
SQL Examples
SELECT BITMAP_OR(BUILD_BITMAP([1,4,5]), BUILD_BITMAP([6,7]))::String;
┌──────────────────────────────────────────────────────────────────┐
│ bitmap_or(build_bitmap([1, 4, 5]), build_bitmap([6, 7]))::string │
├──────────────────────────────────────────────────────────────────┤
│ 1,4,5,6,7 │
└──────────────────────────────────────────────────────────────────┘
4.15 - BITMAP_OR_COUNT
Counts the number of bits set to 1 in the bitmap by performing a logical OR operation.
Analyze Syntax
func.bitmap_or_count( <bitmap> )
Analyze Examples
func.bitmap_or_count(func.to_bitmap('1, 3, 5'))
┌─────────────────────────────────────────────────┐
│ func.bitmap_or_count(func.to_bitmap('1, 3, 5')) │
├─────────────────────────────────────────────────┤
│ 3 │
└─────────────────────────────────────────────────┘
SQL Syntax
BITMAP_OR_COUNT( <bitmap> )
SQL Examples
SELECT BITMAP_OR_COUNT(TO_BITMAP('1, 3, 5'));
┌───────────────────────────────────────┐
│ bitmap_or_count(to_bitmap('1, 3, 5')) │
├───────────────────────────────────────┤
│ 3 │
└───────────────────────────────────────┘
4.16 - BITMAP_SUBSET_IN_RANGE
Generates a sub-bitmap of the source bitmap within a specified range.
Analyze Syntax
func.bitmap_subset_in_range( <bitmap>, <start>, <end> )
Analyze Examples
func.bitmap_subset_in_range(func.build_bitmap([5, 7, 9]), 6, 9)
┌─────────────────────────────────────────────────────────────────┐
│ func.bitmap_subset_in_range(func.build_bitmap([5, 7, 9]), 6, 9) │
├─────────────────────────────────────────────────────────────────┤
│ 7 │
└─────────────────────────────────────────────────────────────────┘
SQL Syntax
BITMAP_SUBSET_IN_RANGE( <bitmap>, <start>, <end> )
SQL Examples
SELECT BITMAP_SUBSET_IN_RANGE(BUILD_BITMAP([5,7,9]), 6, 9)::String;
┌───────────────────────────────────────────────────────────────┐
│ bitmap_subset_in_range(build_bitmap([5, 7, 9]), 6, 9)::string │
├───────────────────────────────────────────────────────────────┤
│ 7 │
└───────────────────────────────────────────────────────────────┘
4.17 - BITMAP_SUBSET_LIMIT
Generates a sub-bitmap of the source bitmap that contains at most <limit> values, taken in ascending order from the values greater than or equal to <start>.
Analyze Syntax
func.bitmap_subset_limit( <bitmap>, <start>, <limit> )
Analyze Examples
func.bitmap_subset_limit(func.build_bitmap([1, 4, 5]), 2, 2)
┌──────────────────────────────────────────────────────────────┐
│ func.bitmap_subset_limit(func.build_bitmap([1, 4, 5]), 2, 2) │
├──────────────────────────────────────────────────────────────┤
│ 4,5 │
└──────────────────────────────────────────────────────────────┘
SQL Syntax
BITMAP_SUBSET_LIMIT( <bitmap>, <start>, <limit> )
SQL Examples
SELECT BITMAP_SUBSET_LIMIT(BUILD_BITMAP([1,4,5]), 2, 2)::String;
┌────────────────────────────────────────────────────────────┐
│ bitmap_subset_limit(build_bitmap([1, 4, 5]), 2, 2)::string │
├────────────────────────────────────────────────────────────┤
│ 4,5 │
└────────────────────────────────────────────────────────────┘
4.18 - BITMAP_UNION
Returns the bitmap that results from performing a logical UNION operation on the input bitmaps.
Analyze Syntax
func.bitmap_union( <bitmap> )
Analyze Examples
func.bitmap_union(func.to_bitmap('1, 3, 5'))
┌──────────────────────────────────────────────┐
│ func.bitmap_union(func.to_bitmap('1, 3, 5')) │
├──────────────────────────────────────────────┤
│ 1,3,5 │
└──────────────────────────────────────────────┘
SQL Syntax
BITMAP_UNION( <bitmap> )
SQL Examples
SELECT BITMAP_UNION(TO_BITMAP('1, 3, 5'))::String;
┌────────────────────────────────────────────┐
│ bitmap_union(to_bitmap('1, 3, 5'))::string │
├────────────────────────────────────────────┤
│ 1,3,5 │
└────────────────────────────────────────────┘
4.19 - BITMAP_XOR
Performs a bitwise XOR (exclusive OR) operation on the two bitmaps.
Analyze Syntax
func.bitmap_xor( <bitmap1>, <bitmap2> )
Analyze Examples
func.bitmap_xor(func.build_bitmap([1, 4, 5]), func.build_bitmap([5, 6, 7]))
┌─────────────────────────────────────────────────────────────────────────────┐
│ func.bitmap_xor(func.build_bitmap([1, 4, 5]), func.build_bitmap([5, 6, 7])) │
├─────────────────────────────────────────────────────────────────────────────┤
│ 1,4,6,7 │
└─────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
BITMAP_XOR( <bitmap1>, <bitmap2> )
SQL Examples
SELECT BITMAP_XOR(BUILD_BITMAP([1,4,5]), BUILD_BITMAP([5,6,7]))::String;
┌──────────────────────────────────────────────────────────────────────┐
│ bitmap_xor(build_bitmap([1, 4, 5]), build_bitmap([5, 6, 7]))::string │
├──────────────────────────────────────────────────────────────────────┤
│ 1,4,6,7 │
└──────────────────────────────────────────────────────────────────────┘
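The two-bitmap functions in this section (BITMAP_AND, BITMAP_OR, BITMAP_NOT, BITMAP_XOR, BITMAP_HAS_ALL, and BITMAP_HAS_ANY) behave like set operations on the integers each bitmap holds. The following plain-Python sketch models them with the built-in set type and reuses the inputs from the examples above; it is an analogy, not how the platform stores bitmaps.

```python
a = {1, 4, 5}

and_result = sorted(a & {4, 5})     # BITMAP_AND -> [4, 5]
or_result = sorted(a | {6, 7})      # BITMAP_OR  -> [1, 4, 5, 6, 7]
not_result = sorted(a - {5, 6, 7})  # BITMAP_NOT -> [1, 4]
xor_result = sorted(a ^ {5, 6, 7})  # BITMAP_XOR -> [1, 4, 6, 7]

has_all = {1, 2} <= a               # BITMAP_HAS_ALL -> False (2 is missing)
has_any = not a.isdisjoint({1, 2})  # BITMAP_HAS_ANY -> True (1 is shared)

print(and_result, or_result, not_result, xor_result, has_all, has_any)
```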
4.20 - BITMAP_XOR_COUNT
Counts the number of bits set to 1 in the bitmap by performing a logical XOR (exclusive OR) operation.
Analyze Syntax
func.bitmap_xor_count( <bitmap> )
Analyze Examples
func.bitmap_xor_count(func.to_bitmap('1, 3, 5'))
┌──────────────────────────────────────────────────┐
│ func.bitmap_xor_count(func.to_bitmap('1, 3, 5')) │
├──────────────────────────────────────────────────┤
│ 3 │
└──────────────────────────────────────────────────┘
SQL Syntax
BITMAP_XOR_COUNT( <bitmap> )
SQL Examples
SELECT BITMAP_XOR_COUNT(TO_BITMAP('1, 3, 5'));
┌────────────────────────────────────────┐
│ bitmap_xor_count(to_bitmap('1, 3, 5')) │
├────────────────────────────────────────┤
│ 3 │
└────────────────────────────────────────┘
4.21 - INTERSECT_COUNT
Counts the number of intersecting bits between two bitmap columns.
Analyze Syntax
func.intersect_count(( '<bitmap1>', '<bitmap2>' ), ( <bitmap_column1>, <bitmap_column2> ))
Analyze Examples
# Given a dataset like this:
┌───────────────────────────────────────┐
│ id │ tag │ v │
├─────────────────┼─────────────────────┤
│ 1 │ a │ 0, 1 │
│ 2 │ b │ 0, 1, 2 │
│ 3 │ c │ 1, 3, 4 │
└───────────────────────────────────────┘
# This is produced
func.intersect_count(('b', 'c'), (v, tag))
┌──────────────────────────────────────────────────────────┐
│ id │ func.intersect_count('b', 'c')(v, tag) │
├─────────────────┼────────────────────────────────────────┤
│ 1 │ 0 │
│ 3 │ 3 │
│ 2 │ 3 │
└──────────────────────────────────────────────────────────┘
SQL Syntax
INTERSECT_COUNT( '<bitmap1>', '<bitmap2>' )( <bitmap_column1>, <bitmap_column2> )
SQL Examples
CREATE TABLE agg_bitmap_test(id Int, tag String, v Bitmap);
INSERT INTO
agg_bitmap_test(id, tag, v)
VALUES
(1, 'a', to_bitmap('0, 1')),
(2, 'b', to_bitmap('0, 1, 2')),
(3, 'c', to_bitmap('1, 3, 4'));
SELECT id, INTERSECT_COUNT('b', 'c')(v, tag)
FROM agg_bitmap_test GROUP BY id;
┌─────────────────────────────────────────────────────┐
│ id │ intersect_count('b', 'c')(v, tag) │
├─────────────────┼───────────────────────────────────┤
│ 1 │ 0 │
│ 3 │ 3 │
│ 2 │ 3 │
└─────────────────────────────────────────────────────┘
4.22 - SUB_BITMAP
Generates a sub-bitmap of the source bitmap, beginning from the start index, with a specified size.
Analyze Syntax
func.sub_bitmap( <bitmap>, <start>, <size> )
Analyze Examples
func.sub_bitmap(func.build_bitmap([1, 2, 3, 4, 5]), 1, 3)
┌───────────────────────────────────────────────────────────┐
│ func.sub_bitmap(func.build_bitmap([1, 2, 3, 4, 5]), 1, 3) │
├───────────────────────────────────────────────────────────┤
│ 2,3,4 │
└───────────────────────────────────────────────────────────┘
SQL Syntax
SUB_BITMAP( <bitmap>, <start>, <size> )
SQL Examples
SELECT SUB_BITMAP(BUILD_BITMAP([1, 2, 3, 4, 5]), 1, 3)::String;
┌─────────────────────────────────────────────────────────┐
│ sub_bitmap(build_bitmap([1, 2, 3, 4, 5]), 1, 3)::string │
├─────────────────────────────────────────────────────────┤
│ 2,3,4 │
└─────────────────────────────────────────────────────────┘
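BITMAP_SUBSET_IN_RANGE, BITMAP_SUBSET_LIMIT, and SUB_BITMAP are easy to confuse, so the sketch below contrasts them in plain Python. The semantics are inferred only from the single example shown for each function (an exclusive end for the range, a value threshold for the limit variant, and a 0-based position for SUB_BITMAP), so treat them as assumptions. The function names are hypothetical.

```python
def subset_in_range(bitmap, start, end):
    """Values v with start <= v < end (end treated as exclusive)."""
    return sorted(v for v in bitmap if start <= v < end)


def subset_limit(bitmap, start, limit):
    """Up to `limit` values, in ascending order, from those >= start."""
    return sorted(v for v in bitmap if v >= start)[:limit]


def sub_bitmap(bitmap, start, size):
    """Up to `size` values, beginning at 0-based position `start`."""
    return sorted(bitmap)[start:start + size]


print(subset_in_range({5, 7, 9}, 6, 9))   # [7]
print(subset_limit({1, 4, 5}, 2, 2))      # [4, 5]
print(sub_bitmap({1, 2, 3, 4, 5}, 1, 3))  # [2, 3, 4]
```

The first two functions filter by value, while SUB_BITMAP selects by position within the sorted bitmap.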
5 - Conditional Functions
This section provides reference information for the conditional functions in PlaidCloud Lakehouse.
5.1 - [ NOT ] BETWEEN
Returns true if the given numeric or string <expr> falls inside the defined lower and upper limits.
Analyze Syntax
table.column.between(<lower_limit>, <upper_limit>)
Analyze Examples
table.column.between(0, 5)
SQL Syntax
<expr> [ NOT ] BETWEEN <lower_limit> AND <upper_limit>
SQL Examples
SELECT 'true' WHERE 5 BETWEEN 0 AND 5;
┌────────┐
│ 'true' │
├────────┤
│ true │
└────────┘
SELECT 'true' WHERE 'data' BETWEEN 'data' AND 'databendcloud';
┌────────┐
│ 'true' │
├────────┤
│ true │
└────────┘
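Both limits are inclusive, as the `5 BETWEEN 0 AND 5` example shows. The sketch below checks the same comparisons with Python's built-in sqlite3 module, used here only as a convenient stand-in SQL engine. SQLite is not PlaidCloud Lakehouse, so results for other data types may differ, and the helper name between is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def between(value, lower, upper):
    """Evaluate `value BETWEEN lower AND upper` in SQLite; returns 1 or 0."""
    row = conn.execute("SELECT ? BETWEEN ? AND ?", (value, lower, upper)).fetchone()
    return row[0]


print(between(5, 0, 5))                          # 1: the upper limit is included
print(between(6, 0, 5))                          # 0
print(between("data", "data", "databendcloud"))  # 1: strings compare lexicographically
```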
5.2 - [ NOT ] IN
Checks whether a value is (or is not) in an explicit list.
Analyze Syntax
table.column.in_((<value1>, <value2> ...))
Analyze Examples
table.column.in_((2, 3))
┌──────────────────────────┐
│ table.column.in_((2, 3)) │
├──────────────────────────┤
│ true │
└──────────────────────────┘
SQL Syntax
<value> [ NOT ] IN (<value1>, <value2> ...)
SQL Examples
SELECT 1 NOT IN (2, 3);
┌────────────────┐
│ 1 not in(2, 3) │
├────────────────┤
│ true │
└────────────────┘
5.3 - AND
Conditional AND operator. Checks whether all of the given conditions are true.
Analyze Syntax
and_(<expr1>[, <expr2> ...])
Analyze Examples
and_(
table.color == 'green',
table.shape == 'circle',
table.price >= 1.25
)
SQL Syntax
<expr1> AND <expr2> [ AND <expr3> ... ]
SQL Examples
SELECT * FROM table WHERE
table.color = 'green'
AND table.shape = 'circle'
AND table.price >= 1.25;
5.4 - CASE
Handles IF/THEN logic. It is structured with at least one pair of WHEN and THEN statements. Every CASE statement must be concluded with the END keyword. The ELSE statement is optional, providing a way to capture values not explicitly specified in the WHEN and THEN statements.
Analyze Syntax
case(
(<condition_1>, <value_1>),
(<condition_2>, <value_2>),
[ ... ]
[ else_=<value_n>]
)
Analyze Examples
A simple example
This example returns a person's name. It starts off searching to see if the first name column has a value (the "if"). If there is a value, concatenate the first name with the last name and return it (the "then"). If there isn't a first name, then return the last name only (the "else").
case(
(table.first_name.is_not(None), func.concat(table.first_name, table.last_name)),
else_=table.last_name
)
A more complex example with multiple conditions
This example returns a price based on quantity. "If" the quantity in the order is more than 100, then give the customer the special price. If it doesn't satisfy the first condition, go to the second. If the quantity is greater than 10 (11-100), then give the customer the bulk price. Otherwise give the customer the regular price.
case(
(order_table.qty > 100, item_table.specialprice),
(order_table.qty > 10, item_table.bulkprice),
else_=item_table.regularprice
)
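The order of the conditions matters because the first matching pair wins: a quantity of 150 satisfies both conditions but receives the special price. The plain-Python sketch below models that first-match rule. The helpers case_sketch and unit_price are hypothetical, and the prices are made-up illustration values.

```python
def case_sketch(*whens, else_=None):
    """Return the value of the first (condition, value) pair whose condition is true."""
    for condition, value in whens:
        if condition:
            return value
    return else_


def unit_price(qty, regular=10.0, bulk=8.0, special=6.0):
    # Conditions are checked in order, so qty > 100 must come before qty > 10.
    return case_sketch((qty > 100, special), (qty > 10, bulk), else_=regular)


print(unit_price(150))  # 6.0  (special price)
print(unit_price(100))  # 8.0  (bulk price: 11 to 100)
print(unit_price(10))   # 10.0 (regular price)
```

Swapping the two conditions would make the special price unreachable, since every quantity above 100 is also above 10.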
This example returns the first initial of the person's first name. If the user's name is wendy, return W. Otherwise if the user's name is jack, return J. Otherwise return E.
case(
(users_table.name == "wendy", "W"),
(users_table.name == "jack", "J"),
else_='E'
)
The above may also be written in shorthand as:
case(
{"wendy": "W", "jack": "J"},
value=users_table.name,
else_='E'
)
Other Examples
This example is from a Table:Lookup step where we are updating the "dock_final" column when the table1.dock_final value is Null.
case(
(table1.dock_final == Null, table2.dock_final),
else_ = table1.dock_final
)
This example is from a Table:Lookup step where we are updating the "Marketing Channel" column when "Marketing Channel" in table1 is not 'none' or the "Serial Number" contains a '_'.
case(
(get_column(table1, 'Marketing Channel') != 'none', get_column(table1, 'Marketing Channel')),
(get_column(table1, 'Serial Number').contains('_'), get_column(table1, 'Marketing Channel')),
(get_column(table2, 'Marketing Channel').is_not(Null), get_column(table2, 'Marketing Channel')),
else_ = 'none'
)
SQL Syntax
CASE
WHEN <condition_1> THEN <value_1>
[ WHEN <condition_2> THEN <value_2> ]
[ ... ]
[ ELSE <value_n> ]
END AS <column_name>
SQL Examples
This example categorizes employee salaries using a CASE statement, presenting details with a dynamically assigned column named "SalaryCategory":
-- Create a sample table
CREATE TABLE Employee (
EmployeeID INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Salary INT
);
-- Insert some sample data
INSERT INTO Employee VALUES (1, 'John', 'Doe', 50000);
INSERT INTO Employee VALUES (2, 'Jane', 'Smith', 60000);
INSERT INTO Employee VALUES (3, 'Bob', 'Johnson', 75000);
INSERT INTO Employee VALUES (4, 'Alice', 'Williams', 90000);
-- Add a new column 'SalaryCategory' using CASE statement
-- Categorize employees based on their salary
SELECT
EmployeeID,
FirstName,
LastName,
Salary,
CASE
WHEN Salary < 60000 THEN 'Low'
WHEN Salary >= 60000 AND Salary < 80000 THEN 'Medium'
WHEN Salary >= 80000 THEN 'High'
ELSE 'Unknown'
END AS SalaryCategory
FROM
Employee;
┌──────────────────────────────────────────────────────────────────────────────────────────┐
│ employeeid │ firstname │ lastname │ salary │ salarycategory │
├─────────────────┼──────────────────┼──────────────────┼─────────────────┼────────────────┤
│ 1 │ John │ Doe │ 50000 │ Low │
│ 2 │ Jane │ Smith │ 60000 │ Medium │
│ 4 │ Alice │ Williams │ 90000 │ High │
│ 3 │ Bob │ Johnson │ 75000 │ Medium │
└──────────────────────────────────────────────────────────────────────────────────────────┘
5.5 - COALESCE
Returns the first non-NULL expression within its arguments; if all arguments are NULL, it returns NULL.
Analyze Syntax
func.coalesce(<expr1>[, <expr2> ...])
Analyze Examples
func.coalesce(table.UOM, 'none')
func.coalesce(get_column(table2, 'TECHNOLOGY_RATE'), 0.0)
func.coalesce(table_beta.adjusted_price, table_alpha.override_price, table_alpha.price) * table_beta.quantity_sold
SQL Syntax
COALESCE(<expr1>[, <expr2> ...])
SQL Examples
SELECT COALESCE(1), COALESCE(1, NULL), COALESCE(NULL, 1, 2);
┌────────────────────────────────────────────────────────┐
│ coalesce(1) │ coalesce(1, null) │ coalesce(null, 1, 2) │
├─────────────┼───────────────────┼──────────────────────┤
│ 1 │ 1 │ 1 │
└────────────────────────────────────────────────────────┘
SELECT COALESCE('a'), COALESCE('a', NULL), COALESCE(NULL, 'a', 'b');
┌────────────────────────────────────────────────────────────────┐
│ coalesce('a') │ coalesce('a', null) │ coalesce(null, 'a', 'b') │
├───────────────┼─────────────────────┼──────────────────────────┤
│ a │ a │ a │
└────────────────────────────────────────────────────────────────┘
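The first-non-NULL rule can be modeled in a few lines of Python. This is only a sketch of the semantics described above, with None standing in for NULL; the helper name `coalesce` is illustrative and is not the engine's implementation.

```python
def coalesce(*args):
    """Return the first argument that is not None, or None if all are None."""
    for value in args:
        if value is not None:
            return value
    return None

print(coalesce(1), coalesce(1, None), coalesce(None, 1, 2))
print(coalesce(None, None))

# Fallback chain similar to the pricing example: adjusted, then override, then list price
adjusted_price, override_price, price, quantity_sold = None, 9.5, 10.0, 4
print(coalesce(adjusted_price, override_price, price) * quantity_sold)
```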
5.6 - Comparison Methods
These comparison methods are available in Analyze expressions.
Category | Expression | Structure | Example | Description |
---|
General Usage | > | > | table.column > 23 | Greater Than |
General Usage | < | < | table.column < 23 | Less Than |
General Usage | >= | >= | table.column >= 23 | Greater than or equal to |
General Usage | <= | <= | table.column <= 23 | Less than or equal to |
General Usage | == | == | table.column == 23 | Equal to |
General Usage | != | != | table.column != 23 | Not Equal to |
General Usage | and_ | and_() | and_(table.a > 23, table.b == u'blue') | Creates an AND SQL condition |
General Usage | any_ | any_() | table.column.any(('red', 'blue', 'yellow')) | Applies the SQL ANY() condition to a column |
General Usage | between | between | table.column.between(23, 46)
get_column(table, 'LAST_CHANGED_DATE').between({start_date}, {end_date}) | Applies the SQL BETWEEN condition |
General Usage | contains | contains | table.column.contains('mno')
table.SOURCE_SYSTEM.contains('TEST') | Applies the SQL LIKE '%<value>%' condition |
General Usage | endswith | endswith | table.column.endswith('xyz')
table.Parent.endswith(':EBITX')
table.PERIOD.endswith("01") | Applies the SQL LIKE '%<value>' condition |
General Usage | FALSE | FALSE | FALSE | False, false, FALSE - Alias for Python False |
General Usage | ilike | ilike | table.column.ilike('%foobar%') | Applies the SQL ILIKE method |
General Usage | in_ | in_() | table.column.in_((1, 2, 3))
get_column(table, 'Source Country').in_(['CN','SG','BR'])
table.MONTH.in_(['01','02','03','04','05','06','07','08','09']) | Tests whether the value is in a tuple or list of values |
General Usage | is_ | is_ | table.column.is_(None)
get_column(table, 'Min SafetyStock').is_(None)
get_column(table, 'date_pod').is_(None) | Applies the SQL IS operator, for example IS NULL |
General Usage | isnot | isnot | table.column.isnot(None) | Applies the SQL IS NOT operator, for example IS NOT NULL |
General Usage | like | like | table.column.like('%foobar%')
table.SOURCE_SYSTEM.like('%Adjustments%') | Applies the SQL LIKE method |
General Usage | not_ | not_() | not_(and_(table.a > 23, table.b == u'blue')) | Inverts the condition |
General Usage | notilike | notilike | table.column.notilike('%foobar%') | Applies the SQL NOT ILIKE method |
General Usage | notin | notin | table.column.notin((1, 2, 3))
table.LE.notin_(['12345','67890']) | Inverts the IN condition |
General Usage | notlike | notlike | table.column.notlike('%foobar%') | Applies the SQL NOT LIKE method |
General Usage | NULL | NULL | NULL | Null, null, NULL - Alias for Python None |
General Usage | or_ | or_() | or_(table.a > 23, table.b == u'blue') | Creates an OR SQL condition |
General Usage | startswith | startswith | table.column.startswith('abc')
get_column(table, 'Zip Code').startswith('9')
get_column(table1, 'GL Account').startswith('CORP') | Applies the SQL LIKE '<value>%' condition |
General Usage | TRUE | TRUE | TRUE | True, true, TRUE - Alias for Python True |
Math Expressions | + | + | + | 2+3=5 |
Math Expressions | - | - | - | 2-3=-1 |
Math Expressions | * | * | * | 2*3=6 |
Math Expressions | / | / | / | 4/2=2 |
Math Expressions | column.op | column.op(operator) | column.op('%') | 5%4=1 |
Math Expressions | column.op | column.op(operator) | column.op('^') | 2.0^3.0=8 |
Math Expressions | column.op | column.op(operator) | column.op('!') | 5!=120 |
Math Expressions | column.op | column.op(operator) | column.op('!!') | !!5=120 |
Math Expressions | column.op | column.op(operator) | column.op('@') | @-5.0=5 |
Math Expressions | column.op | column.op(operator) | column.op('&') | 91&15=11 |
Math Expressions | column.op | column.op(operator) | column.op('#') | 17#5=20 |
Math Expressions | column.op | column.op(operator) | column.op('~') | ~1=-2 |
Math Expressions | column.op | column.op(operator) | column.op('<<') | 1<<4=16 |
Math Expressions | column.op | column.op(operator) | column.op('>>') | 8>>2=2 |
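The arithmetic in the Math Expressions rows can be checked with plain Python. This sketch assumes the operators carry PostgreSQL-style meanings, which the listed results suggest (`^` as exponent, `#` as bitwise XOR, `@` as absolute value); Python spells some of them differently, as noted in the comments.

```python
import math

# Python equivalents of the Math Expressions rows. The dictionary keys
# show the SQL-style spelling; the values use the Python spelling.
results = {
    "5 % 4": 5 % 4,              # modulo
    "2.0 ^ 3.0": 2.0 ** 3.0,     # exponent is ** in Python
    "5 !": math.factorial(5),    # factorial
    "@ -5.0": abs(-5.0),         # absolute value
    "91 & 15": 91 & 15,          # bitwise AND
    "17 # 5": 17 ^ 5,            # bitwise XOR is ^ in Python
    "~ 1": ~1,                   # bitwise NOT
    "1 << 4": 1 << 4,            # shift left
    "8 >> 2": 8 >> 2,            # shift right
}

for expression, value in results.items():
    print(f"{expression} = {value}")
```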
5.7 - ERROR_OR
Returns the first non-error expression among its inputs. If all expressions result in errors, it returns NULL.
Analyze Syntax
func.error_or(expr1, expr2, ...)
Analyze Examples
# Returns the valid date if no errors occur
# Returns the current date if the conversion results in an error
func.now(), func.error_or(func.to_date('2024-12-25'), func.now())
┌──────────────────────────────────────────────────────────────────────────────────────────┐
│ func.now() │ func.error_or(func.to_date('2024-12-25'), func.now()) │
├─────────────────────────────────┼────────────────────────────────────────────────────────┤
│ 2024-03-18 01:22:39.460320 │ 2024-12-25 │
└──────────────────────────────────────────────────────────────────────────────────────────┘
# Returns NULL because the conversion results in an error
func.error_or(func.to_date('2024-1234'))
┌────────────────────────────────────────────┐
│ func.error_or(func.to_date('2024-1234')) │
├────────────────────────────────────────────┤
│ NULL │
└────────────────────────────────────────────┘
SQL Syntax
ERROR_OR(expr1, expr2, ...)
SQL Examples
-- Returns the valid date if no errors occur
-- Returns the current date if the conversion results in an error
SELECT NOW(), ERROR_OR('2024-12-25'::DATE, NOW()::DATE);
┌────────────────────────────────────────────────────────────────────────┐
│ now() │ error_or('2024-12-25'::date, now()::date) │
├────────────────────────────┼───────────────────────────────────────────┤
│ 2024-03-18 01:22:39.460320 │ 2024-12-25 │
└────────────────────────────────────────────────────────────────────────┘
-- Returns NULL because the conversion results in an error
SELECT ERROR_OR('2024-1234'::DATE);
┌─────────────────────────────┐
│ error_or('2024-1234'::date) │
├─────────────────────────────┤
│ NULL │
└─────────────────────────────┘
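The "first expression that does not fail" behavior can be sketched in Python as follows. The helper `error_or` is a hypothetical model, not the engine's code: each argument is wrapped in a zero-argument callable so that a failure happens inside the helper, and None stands in for NULL.

```python
from datetime import date

def error_or(*thunks):
    """Call each zero-argument callable in order and return the first result
    that does not raise. Return None (SQL NULL) if every one fails."""
    for thunk in thunks:
        try:
            return thunk()
        except Exception:
            continue
    return None

# Valid date string: the first expression succeeds
good = error_or(lambda: date.fromisoformat("2024-12-25"), date.today)
# Invalid date string and no fallback: the result is None
bad = error_or(lambda: date.fromisoformat("2024-1234"))
print(good, bad)
```

The lambdas are a Python-specific device: without them the invalid conversion would raise before `error_or` was ever called.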
5.8 - GREATEST
Returns the maximum value from a set of values.
Analyze Syntax
func.greatest(<value1>, <value2> ...)
Analyze Examples
func.greatest((5, 9, 4))
┌──────────────────────────┐
│ func.greatest((5, 9, 4)) │
├──────────────────────────┤
│ 9 │
└──────────────────────────┘
SQL Syntax
GREATEST(<value1>, <value2> ...)
SQL Examples
SELECT GREATEST(5, 9, 4);
┌───────────────────┐
│ greatest(5, 9, 4) │
├───────────────────┤
│ 9 │
└───────────────────┘
5.9 - IF
If <cond1> is TRUE, it returns <expr1>. Otherwise, if <cond2> is TRUE, it returns <expr2>, and so on.
Analyze Syntax
func.if(<cond1>, <expr1>, [<cond2>, <expr2> ...], <expr_else>)
Analyze Examples
func.if((1 > 2), 3, (4 < 5), 6, 7)
┌────────────────────────────────────┐
│ func.if((1 > 2), 3, (4 < 5), 6, 7) │
├────────────────────────────────────┤
│ 6 │
└────────────────────────────────────┘
SQL Syntax
IF(<cond1>, <expr1>, [<cond2>, <expr2> ...], <expr_else>)
SQL Examples
SELECT IF(1 > 2, 3, 4 < 5, 6, 7);
┌───────────────────────────────┐
│ if((1 > 2), 3, (4 < 5), 6, 7) │
├───────────────────────────────┤
│ 6 │
└───────────────────────────────┘
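The multi-branch form can be read as a chain of condition/value pairs followed by a final fallback. The Python sketch below models that reading; the name `sql_if` is hypothetical, and returning the last argument when no condition holds is inferred from the `<expr_else>` slot in the syntax.

```python
def sql_if(*args):
    """args = cond1, expr1, cond2, expr2, ..., expr_else."""
    if len(args) < 3 or len(args) % 2 == 0:
        raise ValueError("expected cond/expr pairs followed by an else value")
    *pairs, default = args
    # Walk the (condition, value) pairs in order and stop at the first TRUE
    for cond, value in zip(pairs[0::2], pairs[1::2]):
        if cond is True:
            return value
    return default

print(sql_if(1 > 2, 3, 4 < 5, 6, 7))
print(sql_if(1 > 2, 3, 4 > 5, 6, 7))
```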
5.10 - IFNULL
If <expr1> is NULL, returns <expr2>, otherwise returns <expr1>.
Analyze Syntax
func.ifnull(<expr1>, <expr2>)
Analyze Examples
func.ifnull(null, 'b'), func.ifnull('a', 'b')
┌────────────────────────────────────────────────┐
│ func.ifnull(null, 'b') │ func.ifnull('a', 'b') │
├────────────────────────┼───────────────────────┤
│ b │ a │
└────────────────────────────────────────────────┘
func.ifnull(null, 2), func.ifnull(1, 2)
┌──────────────────────────────────────────┐
│ func.ifnull(null, 2) │ func.ifnull(1, 2) │
├──────────────────────┼───────────────────┤
│ 2 │ 1 │
└──────────────────────────────────────────┘
SQL Syntax
IFNULL(<expr1>, <expr2>)
Aliases
NVL
SQL Examples
SELECT IFNULL(NULL, 'b'), IFNULL('a', 'b');
┌──────────────────────────────────────┐
│ ifnull(null, 'b') │ ifnull('a', 'b') │
├───────────────────┼──────────────────┤
│ b │ a │
└──────────────────────────────────────┘
SELECT IFNULL(NULL, 2), IFNULL(1, 2);
┌────────────────────────────────┐
│ ifnull(null, 2) │ ifnull(1, 2) │
├─────────────────┼──────────────┤
│ 2 │ 1 │
└────────────────────────────────┘
5.11 - IS [ NOT ] DISTINCT FROM
Compares whether two expressions are equal (or not equal) with awareness of nullability, meaning it treats NULLs as known values for comparing equality.
SQL Syntax
<expr1> IS [ NOT ] DISTINCT FROM <expr2>
SQL Examples
SELECT NULL IS DISTINCT FROM NULL;
┌────────────────────────────┐
│ null is distinct from null │
├────────────────────────────┤
│ false │
└────────────────────────────┘
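To illustrate the difference from ordinary equality, the sketch below models both comparisons in Python, with None standing in for NULL. The function names are hypothetical; the point is that a plain `=` yields NULL when either side is NULL, while IS [ NOT ] DISTINCT FROM always yields true or false.

```python
def sql_equals(a, b):
    """Ordinary SQL equality: the result is unknown (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b

def is_distinct_from(a, b):
    """NULL-aware inequality: None is treated as a comparable value."""
    if a is None or b is None:
        return a is not b   # distinct unless both sides are None
    return a != b

def is_not_distinct_from(a, b):
    return not is_distinct_from(a, b)

print(sql_equals(None, None), is_distinct_from(None, None))
print(is_distinct_from(1, None), is_not_distinct_from(1, 1))
```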
5.12 - IS_ERROR
Returns a Boolean value indicating whether an expression is an error value.
See also: IS_NOT_ERROR
Analyze Syntax
func.is_error(<expr>)
Analyze Examples
# Indicates division by zero, hence an error
func.is_error((1 / 0)), func.is_not_error((1 / 0))
┌─────────────────────────────────────────────────────┐
│ func.is_error((1 / 0)) │ func.is_not_error((1 / 0)) │
├────────────────────────┼────────────────────────────┤
│ true │ false │
└─────────────────────────────────────────────────────┘
# The conversion to DATE is successful, hence not an error
func.is_error(func.to_date('2024-03-17')), func.is_not_error(func.to_date('2024-03-17'))
┌───────────────────────────────────────────────────────────────────────────────────────────┐
│ func.is_error(func.to_date('2024-03-17')) │ func.is_not_error(func.to_date('2024-03-17')) │
├───────────────────────────────────────────┼───────────────────────────────────────────────┤
│ false │ true │
└───────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
IS_ERROR(<expr>)
Return Type
Returns true if the expression is an error, otherwise false.
SQL Examples
-- Indicates division by zero, hence an error
SELECT IS_ERROR(1/0), IS_NOT_ERROR(1/0);
┌───────────────────────────────────────────┐
│ is_error((1 / 0)) │ is_not_error((1 / 0)) │
├───────────────────┼───────────────────────┤
│ true │ false │
└───────────────────────────────────────────┘
-- The conversion to DATE is successful, hence not an error
SELECT IS_ERROR('2024-03-17'::DATE), IS_NOT_ERROR('2024-03-17'::DATE);
┌─────────────────────────────────────────────────────────────────┐
│ is_error('2024-03-17'::date) │ is_not_error('2024-03-17'::date) │
├──────────────────────────────┼──────────────────────────────────┤
│ false │ true │
└─────────────────────────────────────────────────────────────────┘
5.13 - IS_NOT_ERROR
Returns a Boolean value indicating whether an expression is not an error value.
See also: IS_ERROR
Analyze Syntax
func.is_not_error(<expr>)
Analyze Examples
# Indicates division by zero, hence an error
func.is_error((1 / 0)), func.is_not_error((1 / 0))
┌─────────────────────────────────────────────────────┐
│ func.is_error((1 / 0)) │ func.is_not_error((1 / 0)) │
├────────────────────────┼────────────────────────────┤
│ true │ false │
└─────────────────────────────────────────────────────┘
# The conversion to DATE is successful, hence not an error
func.is_error(func.to_date('2024-03-17')), func.is_not_error(func.to_date('2024-03-17'))
┌───────────────────────────────────────────────────────────────────────────────────────────┐
│ func.is_error(func.to_date('2024-03-17')) │ func.is_not_error(func.to_date('2024-03-17')) │
├───────────────────────────────────────────┼───────────────────────────────────────────────┤
│ false │ true │
└───────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
IS_NOT_ERROR(<expr>)
Return Type
Returns true if the expression is not an error, otherwise false.
SQL Examples
-- Indicates division by zero, hence an error
SELECT IS_ERROR(1/0), IS_NOT_ERROR(1/0);
┌───────────────────────────────────────────┐
│ is_error((1 / 0)) │ is_not_error((1 / 0)) │
├───────────────────┼───────────────────────┤
│ true │ false │
└───────────────────────────────────────────┘
-- The conversion to DATE is successful, hence not an error
SELECT IS_ERROR('2024-03-17'::DATE), IS_NOT_ERROR('2024-03-17'::DATE);
┌─────────────────────────────────────────────────────────────────┐
│ is_error('2024-03-17'::date) │ is_not_error('2024-03-17'::date) │
├──────────────────────────────┼──────────────────────────────────┤
│ false │ true │
└─────────────────────────────────────────────────────────────────┘
5.14 - IS_NOT_NULL
Checks whether a value is not NULL.
Analyze Syntax
func.is_not_null(<expr>)
Analyze Examples
func.is_not_null(1)
┌─────────────────────┐
│ func.is_not_null(1) │
├─────────────────────┤
│ true │
└─────────────────────┘
SQL Syntax
IS_NOT_NULL(<expr>)
SQL Examples
SELECT IS_NOT_NULL(1);
┌────────────────┐
│ is_not_null(1) │
├────────────────┤
│ true │
└────────────────┘
5.15 - IS_NULL
Checks whether a value is NULL.
Analyze Syntax
func.is_null(<expr>)
Analyze Examples
func.is_null(1)
┌─────────────────┐
│ func.is_null(1) │
├─────────────────┤
│ false │
└─────────────────┘
SQL Syntax
IS_NULL(<expr>)
SQL Examples
SELECT IS_NULL(1);
┌────────────┐
│ is_null(1) │
├────────────┤
│ false │
└────────────┘
5.16 - LEAST
Returns the minimum value from a set of values.
Analyze Syntax
func.least((<value1>, <value2> ...))
Analyze Examples
func.least((5, 9, 4))
┌───────────────────────┐
│ func.least((5, 9, 4)) │
├───────────────────────┤
│ 4 │
└───────────────────────┘
SQL Syntax
LEAST(<value1>, <value2> ...)
SQL Examples
SELECT LEAST(5, 9, 4);
┌────────────────┐
│ least(5, 9, 4) │
├────────────────┤
│ 4 │
└────────────────┘
5.17 - NULLIF
Returns NULL if the two expressions are equal; otherwise, returns <expr1>. Both expressions must have the same data type.
Analyze Syntax
func.nullif(<expr1>, <expr2>)
Analyze Examples
func.nullif(0, null)
┌──────────────────────┐
│ func.nullif(0, null) │
├──────────────────────┤
│ 0 │
└──────────────────────┘
SQL Syntax
NULLIF(<expr1>, <expr2>)
SQL Examples
SELECT NULLIF(0, NULL);
┌─────────────────┐
│ nullif(0, null) │
├─────────────────┤
│ 0 │
└─────────────────┘
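A small Python model may make the rule easier to see. The helper `nullif` is hypothetical and uses None for NULL; treating a NULL on either side as "not equal" is an assumption consistent with the example above, where NULLIF(0, NULL) returns 0.

```python
def nullif(expr1, expr2):
    """Return None when both values are non-NULL and equal, else expr1."""
    if expr1 is not None and expr2 is not None and expr1 == expr2:
        return None
    return expr1

print(nullif(0, None))   # values differ (second is NULL), so expr1 comes back
print(nullif(5, 5))      # equal values collapse to None
print(nullif(5, 6))
```

A common use is turning a sentinel such as 0 or an empty string into NULL before further processing, for example `nullif(count, 0)` ahead of a division.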
5.18 - NVL
If <expr1> is NULL, returns <expr2>, otherwise returns <expr1>.
Analyze Syntax
func.nvl(<expr1>, <expr2>)
Analyze Examples
func.nvl(null, 'b'), func.nvl('a', 'b')
┌──────────────────────────────────────────┐
│ func.nvl(null, 'b') │ func.nvl('a', 'b') │
├─────────────────────┼────────────────────┤
│ b │ a │
└──────────────────────────────────────────┘
func.nvl(null, 2), func.nvl(1, 2)
┌────────────────────────────────────┐
│ func.nvl(null, 2) │ func.nvl(1, 2) │
├───────────────────┼────────────────┤
│ 2 │ 1 │
└────────────────────────────────────┘
SQL Syntax
NVL(<expr1>, <expr2>)
Aliases
IFNULL
SQL Examples
SELECT NVL(NULL, 'b'), NVL('a', 'b');
┌────────────────────────────────┐
│ nvl(null, 'b') │ nvl('a', 'b') │
├────────────────┼───────────────┤
│ b │ a │
└────────────────────────────────┘
SELECT NVL(NULL, 2), NVL(1, 2);
┌──────────────────────────┐
│ nvl(null, 2) │ nvl(1, 2) │
├──────────────┼───────────┤
│ 2 │ 1 │
└──────────────────────────┘
5.19 - NVL2
Returns <expr2> if <expr1> is not NULL; otherwise, it returns <expr3>.
Analyze Syntax
func.nvl2(<expr1> , <expr2> , <expr3>)
Analyze Examples
func.nvl2('a', 'b', 'c'), func.nvl2(null, 'b', 'c')
┌──────────────────────────────────────────────────────┐
│ func.nvl2('a', 'b', 'c') │ func.nvl2(null, 'b', 'c') │
├──────────────────────────┼───────────────────────────┤
│ b │ c │
└──────────────────────────────────────────────────────┘
func.nvl2(1, 2, 3), func.nvl2(null, 2, 3)
┌────────────────────────────────────────────┐
│ func.nvl2(1, 2, 3) │ func.nvl2(null, 2, 3) │
├────────────────────┼───────────────────────┤
│ 2 │ 3 │
└────────────────────────────────────────────┘
SQL Syntax
NVL2(<expr1> , <expr2> , <expr3>)
SQL Examples
SELECT NVL2('a', 'b', 'c'), NVL2(NULL, 'b', 'c');
┌────────────────────────────────────────────┐
│ nvl2('a', 'b', 'c') │ nvl2(null, 'b', 'c') │
├─────────────────────┼──────────────────────┤
│ b │ c │
└────────────────────────────────────────────┘
SELECT NVL2(1, 2, 3), NVL2(NULL, 2, 3);
┌──────────────────────────────────┐
│ nvl2(1, 2, 3) │ nvl2(null, 2, 3) │
├───────────────┼──────────────────┤
│ 2 │ 3 │
└──────────────────────────────────┘
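NVL and NVL2 are easy to confuse, so the sketch below places hypothetical Python models of both side by side, with None standing in for NULL. NVL returns the tested value itself when it is present, while NVL2 never returns the tested value.

```python
def nvl(expr1, expr2):
    """Return expr1 unless it is None, in which case return expr2."""
    return expr2 if expr1 is None else expr1

def nvl2(expr1, expr2, expr3):
    """Return expr2 when expr1 is not None, otherwise expr3."""
    return expr3 if expr1 is None else expr2

print(nvl(None, "b"), nvl("a", "b"))
print(nvl2("a", "b", "c"), nvl2(None, "b", "c"))
print(nvl2(1, 2, 3), nvl2(None, 2, 3))
```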
5.20 - OR
Conditional OR operator. Checks whether either condition is true.
Analyze Syntax
or_(<expr1>[, <expr2> ...])
Analyze Examples
or_(
table.color == 'green',
table.shape == 'circle',
table.price >= 1.25
)
SQL Syntax
<expr1> OR <expr2> [OR <expr3> ...]
SQL Examples
SELECT * FROM table WHERE
table.color = 'green'
OR table.shape = 'circle'
OR table.price >= 1.25;
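The filter above keeps a row when at least one of the three conditions holds. The Python sketch below applies the same test to a few made-up rows; the sample data is purely illustrative.

```python
# Hypothetical sample rows, for illustration only
rows = [
    {"color": "green", "shape": "square", "price": 0.50},  # matches on color
    {"color": "red", "shape": "circle", "price": 0.75},    # matches on shape
    {"color": "blue", "shape": "square", "price": 2.00},   # matches on price
    {"color": "red", "shape": "square", "price": 1.00},    # matches nothing
]

matches = [
    row for row in rows
    if row["color"] == "green" or row["shape"] == "circle" or row["price"] >= 1.25
]

for row in matches:
    print(row)
```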
6 - Context Functions
This section provides reference information for the context-related functions in PlaidCloud Lakehouse.
6.1 - CONNECTION_ID
Returns the connection ID for the current connection.
Analyze Syntax
func.connection_id()
Analyze Examples
func.connection_id()
┌──────────────────────────────────────┐
│ func.connection_id() │
├──────────────────────────────────────┤
│ 23cb06ec-583e-4eba-b790-7c8cf72a53f8 │
└──────────────────────────────────────┘
SQL Syntax
CONNECTION_ID()
SQL Examples
SELECT CONNECTION_ID();
┌──────────────────────────────────────┐
│ connection_id() │
├──────────────────────────────────────┤
│ 23cb06ec-583e-4eba-b790-7c8cf72a53f8 │
└──────────────────────────────────────┘
6.2 - CURRENT_USER
Returns the user name and host name combination for the account that the server used to authenticate the current client. This account determines your access privileges. The return value is a string in the utf8 character set.
Analyze Syntax
func.current_user()
Analyze Examples
func.current_user()
┌─────────────────────┐
│ func.current_user() │
├─────────────────────┤
│ 'root'@'%' │
└─────────────────────┘
SQL Syntax
CURRENT_USER()
SQL Examples
SELECT CURRENT_USER();
┌────────────────┐
│ current_user() │
├────────────────┤
│ 'root'@'%' │
└────────────────┘
6.3 - DATABASE
Returns the name of the currently selected database. If no database is selected, this function returns default.
Analyze Syntax
func.database()
Analyze Examples
func.database()
┌─────────────────┐
│ func.database() │
├─────────────────┤
│ default │
└─────────────────┘
SQL Syntax
DATABASE()
SQL Examples
SELECT DATABASE();
┌────────────┐
│ database() │
├────────────┤
│ default │
└────────────┘
6.4 - LAST_QUERY_ID
Returns the ID of a query in the current session, selected by <index>. The index can be an integer expression such as -1, 1, or 1+2. An out-of-range index returns an empty string.
Analyze Syntax
func.last_query_id(<index>)
Analyze Examples
func.last_query_id(-1)
┌──────────────────────────────────────┐
│ func.last_query_id((- 1)) │
├──────────────────────────────────────┤
│ a6f615c6-5bad-4863-8558-afd01889448c │
└──────────────────────────────────────┘
SQL Syntax
LAST_QUERY_ID(<index>)
SQL Examples
SELECT LAST_QUERY_ID(-1);
┌──────────────────────────────────────┐
│ last_query_id((- 1)) │
├──────────────────────────────────────┤
│ a6f615c6-5bad-4863-8558-afd01889448c │
└──────────────────────────────────────┘
6.5 - VERSION
Returns the current version of PlaidCloud LakehouseQuery.
Analyze Syntax
func.version()
Analyze Examples
func.version()
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.version() │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PlaidCloud LakehouseQuery v1.2.252-nightly-193ed56304(rust-1.75.0-nightly-2023-12-12T22:07:25.371440000Z) │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
VERSION()
SQL Examples
SELECT VERSION();
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ version() │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PlaidCloud LakehouseQuery v1.2.252-nightly-193ed56304(rust-1.75.0-nightly-2023-12-12T22:07:25.371440000Z) │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────┘
7 - Conversion Functions
This section provides reference information for the conversion functions in PlaidCloud Lakehouse.
Please note the following when converting a value from one type to another:
When converting from floating-point numbers, decimal numbers, or strings to integers or to decimal numbers with fractional parts, PlaidCloud Lakehouse rounds the values to the nearest integer by default. This behavior is controlled by the numeric_cast_option setting, which defaults to 'rounding'. When numeric_cast_option is explicitly set to 'truncating', PlaidCloud Lakehouse truncates the value instead, discarding the fractional part.
SELECT CAST('0.6' AS DECIMAL(10, 0)), CAST(0.6 AS DECIMAL(10, 0)), CAST(1.5 AS INT);
┌──────────────────────────────────────────────────────────────────────────────────┐
│ cast('0.6' as decimal(10, 0)) │ cast(0.6 as decimal(10, 0)) │ cast(1.5 as int32) │
├───────────────────────────────┼─────────────────────────────┼────────────────────┤
│ 1 │ 1 │ 2 │
└──────────────────────────────────────────────────────────────────────────────────┘
SET numeric_cast_option = 'truncating';
SELECT CAST('0.6' AS DECIMAL(10, 0)), CAST(0.6 AS DECIMAL(10, 0)), CAST(1.5 AS INT);
┌──────────────────────────────────────────────────────────────────────────────────┐
│ cast('0.6' as decimal(10, 0)) │ cast(0.6 as decimal(10, 0)) │ cast(1.5 as int32) │
├───────────────────────────────┼─────────────────────────────┼────────────────────┤
│ 0 │ 0 │ 1 │
└──────────────────────────────────────────────────────────────────────────────────┘
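The two settings can be compared side by side with Python's decimal module. This is a sketch under assumptions rather than the engine's implementation: the helper `cast_to_int` is hypothetical, and rounding ties away from zero (ROUND_HALF_UP) is assumed because the documented results do not show how ties such as 2.5 are handled.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def cast_to_int(value, numeric_cast_option="rounding"):
    """Model CAST(value AS INT) under the two numeric_cast_option values."""
    modes = {"rounding": ROUND_HALF_UP, "truncating": ROUND_DOWN}
    if numeric_cast_option not in modes:
        raise ValueError(f"unknown numeric_cast_option: {numeric_cast_option!r}")
    # Go through str() so that a float such as 0.6 is read as the decimal 0.6
    number = Decimal(str(value))
    return int(number.quantize(Decimal("1"), rounding=modes[numeric_cast_option]))

for option in ("rounding", "truncating"):
    print(option, cast_to_int("0.6", option), cast_to_int(0.6, option), cast_to_int(1.5, option))
```

With 'rounding' the three inputs give 1, 1, and 2; with 'truncating' they give 0, 0, and 1, matching the two result sets above.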
The table below summarizes the supported numeric casts between source and target data types. Note that String to Integer casting requires the source string to contain an integer value.
Source Type | Target Type |
---|
String | Decimal |
Float | Decimal |
Decimal | Decimal |
Float | Int |
Decimal | Int |
String (Int) | Int |
PlaidCloud Lakehouse also offers a variety of functions for converting expressions into different date and time formats. For more information, see Date & Time Functions.
7.1 - BUILD_BITMAP
Converts an array of positive integers to a BITMAP value.
Analyze Syntax
func.build_bitmap( <expr> )
Analyze Examples
func.to_string(func.build_bitmap([1, 4, 5]))
┌───────────────────────────────────────────────┐
│ func.to_string(func.build_bitmap([1, 4, 5])) │
├───────────────────────────────────────────────┤
│ 1,4,5 │
└───────────────────────────────────────────────┘
SQL Syntax
BUILD_BITMAP( <expr> )
SQL Examples
SELECT BUILD_BITMAP([1,4,5])::String;
┌─────────────────────────────────┐
│ build_bitmap([1, 4, 5])::string │
├─────────────────────────────────┤
│ 1,4,5 │
└─────────────────────────────────┘
7.2 - CAST, ::
Converts a value from one data type to another. :: is an alias for CAST.
See also: TRY_CAST
Analyze Syntax
func.cast( <expr>, <data_type> )
Analyze Examples
func.cast(1, string), func.to_string(1)
┌───────────────────────────────────────────┐
│ func.cast(1, string) │ func.to_string(1) │
├──────────────────────┼────────────────────┤
│ 1 │ 1 │
└───────────────────────────────────────────┘
SQL Syntax
CAST( <expr> AS <data_type> )
<expr>::<data_type>
SQL Examples
SELECT CAST(1 AS VARCHAR), 1::VARCHAR;
┌───────────────────────────────┐
│ cast(1 as string) │ 1::string │
├───────────────────┼───────────┤
│ 1 │ 1 │
└───────────────────────────────┘
7.3 - TO_BITMAP
Converts a value to BITMAP data type.
Analyze Syntax
func.to_bitmap( <expr> )
Analyze Examples
func.to_bitmap('1101')
┌─────────────────────────┐
│ func.to_bitmap('1101') │
├─────────────────────────┤
│ <bitmap binary> │
└─────────────────────────┘
SQL Syntax
TO_BITMAP( <expr> )
SQL Examples
SELECT TO_BITMAP('1101');
┌───────────────────┐
│ to_bitmap('1101') │
├───────────────────┤
│ <bitmap binary> │
└───────────────────┘
7.4 - TO_BOOLEAN
Converts a value to BOOLEAN data type.
Analyze Syntax
func.to_boolean( <expr> )
Analyze Examples
func.to_boolean('true')
┌──────────────────────────┐
│ func.to_boolean('true') │
├──────────────────────────┤
│ true │
└──────────────────────────┘
SQL Syntax
TO_BOOLEAN( <expr> )
SQL Examples
SELECT TO_BOOLEAN('true');
┌────────────────────┐
│ to_boolean('true') │
├────────────────────┤
│ true │
└────────────────────┘
7.5 - TO_FLOAT32
Converts a value to FLOAT32 data type.
Analyze Syntax
func.to_float32( <expr> )
Analyze Examples
func.to_float32('1.2')
┌─────────────────────────┐
│ func.to_float32('1.2') │
├─────────────────────────┤
│ 1.2 │
└─────────────────────────┘
SQL Syntax
TO_FLOAT32( <expr> )
SQL Examples
SELECT TO_FLOAT32('1.2');
┌───────────────────┐
│ to_float32('1.2') │
├───────────────────┤
│ 1.2 │
└───────────────────┘
7.6 - TO_FLOAT64
Converts a value to FLOAT64 data type.
Analyze Syntax
func.to_float64( <expr> )
Analyze Examples
func.to_float64('1.2')
┌─────────────────────────┐
│ func.to_float64('1.2') │
├─────────────────────────┤
│ 1.2 │
└─────────────────────────┘
SQL Syntax
TO_FLOAT64( <expr> )
SQL Examples
SELECT TO_FLOAT64('1.2');
┌───────────────────┐
│ to_float64('1.2') │
├───────────────────┤
│ 1.2 │
└───────────────────┘
7.7 - TO_HEX
For a string argument str, TO_HEX() returns a hexadecimal string representation of str where each byte of each character in str is converted to two hexadecimal digits. The inverse of this operation is performed by the UNHEX() function.
For a numeric argument N, TO_HEX() returns a hexadecimal string representation of the value of N treated as a longlong (BIGINT) number.
Analyze Syntax
func.to_hex( <expr> )
Analyze Examples
func.to_hex('abc')
┌────────────────────┐
│ func.to_hex('abc') │
├────────────────────┤
│ 616263 │
└────────────────────┘
SQL Syntax
TO_HEX( <expr> )
Aliases
HEX
SQL Examples
SELECT HEX('abc'), TO_HEX('abc');
┌────────────────────────────┐
│ hex('abc') │ to_hex('abc') │
├────────────┼───────────────┤
│ 616263 │ 616263 │
└────────────────────────────┘
SELECT HEX(255), TO_HEX(255);
┌────────────────────────┐
│ hex(255) │ to_hex(255) │
├──────────┼─────────────┤
│ ff │ ff │
└────────────────────────┘
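The byte-by-byte conversion described above can be reproduced in Python. The helper `to_hex` is a hypothetical model limited to strings and non-negative integers; UTF-8 is assumed as the string encoding, and negative numbers are left out because their representation depends on the integer width.

```python
def to_hex(value):
    """Hex-encode a string byte by byte, or render a non-negative integer in base 16."""
    if isinstance(value, str):
        return value.encode("utf-8").hex()   # two hex digits per byte
    if isinstance(value, int) and value >= 0:
        return format(value, "x")
    raise ValueError("this sketch only handles strings and non-negative integers")

print(to_hex("abc"), to_hex(255))

# The inverse direction, comparable to what UNHEX does for strings
print(bytes.fromhex("616263").decode("utf-8"))
```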
7.8 - TO_INT16
Converts a value to INT16 data type.
Analyze Syntax
func.to_int16( <expr> )
Analyze Examples
func.to_int16('123')
┌──────────────────────┐
│ func.to_int16('123') │
├──────────────────────┤
│ 123 │
└──────────────────────┘
SQL Syntax
TO_INT16( <expr> )
SQL Examples
SELECT TO_INT16('123');
┌─────────────────┐
│ to_int16('123') │
├─────────────────┤
│ 123 │
└─────────────────┘
7.9 - TO_INT32
Converts a value to INT32 data type.
Analyze Syntax
func.to_int32( <expr> )
Analyze Examples
func.to_int32('123')
┌──────────────────────┐
│ func.to_int32('123') │
├──────────────────────┤
│ 123 │
└──────────────────────┘
SQL Syntax
TO_INT32( <expr> )
SQL Examples
SELECT TO_INT32('123');
┌─────────────────┐
│ to_int32('123') │
├─────────────────┤
│ 123 │
└─────────────────┘
7.10 - TO_INT64
Converts a value to INT64 data type.
Analyze Syntax
func.to_int64( <expr> )
Analyze Examples
func.to_int64('123')
┌──────────────────────┐
│ func.to_int64('123') │
├──────────────────────┤
│ 123 │
└──────────────────────┘
SQL Syntax
TO_INT64( <expr> )
SQL Examples
SELECT TO_INT64('123');
┌─────────────────┐
│ to_int64('123') │
├─────────────────┤
│ 123 │
└─────────────────┘
7.11 - TO_INT8
Converts a value to INT8 data type.
Analyze Syntax
func.to_int8( <expr> )
Analyze Examples
func.to_int8('123')
┌─────────────────────┐
│ func.to_int8('123') │
├─────────────────────┤
│ 123 │
└─────────────────────┘
SQL Syntax
TO_INT8( <expr> )
SQL Examples
SELECT TO_INT8('123');
┌────────────────┐
│ to_int8('123') │
├────────────────┤
│ 123 │
└────────────────┘
7.12 - TO_STRING
Converts a value to String data type, or converts a Date value to a specific string format. To customize the format of date and time in PlaidCloud Lakehouse, you can utilize specifiers. These specifiers allow you to define the desired format for date and time values. For a comprehensive list of supported specifiers, see Formatting Date and Time.
Analyze Syntax
func.to_string( '<expr>' )
Analyze Examples
func.date_format('1.23'), func.to_string('1.23'), func.to_text('1.23'), func.to_varchar('1.23'), func.json_to_string('1.23')
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.date_format('1.23') │ func.to_string('1.23') │ func.to_text('1.23') │ func.to_varchar('1.23') │ func.json_to_string('1.23') │
├──────────────────────────┼────────────────────────┼──────────────────────┼─────────────────────────┼─────────────────────────────┤
│ 1.23 │ 1.23 │ 1.23 │ 1.23 │ 1.23 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_STRING( '<expr>' )
TO_STRING( '<date>', '<format>' )
Aliases
DATE_FORMAT, JSON_TO_STRING, TO_TEXT, TO_VARCHAR
Return Type
String.
SQL Examples
SELECT
DATE_FORMAT('1.23'),
TO_STRING('1.23'),
TO_TEXT('1.23'),
TO_VARCHAR('1.23'),
JSON_TO_STRING('1.23');
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ date_format('1.23') │ to_string('1.23') │ to_text('1.23') │ to_varchar('1.23') │ json_to_string('1.23') │
├─────────────────────┼───────────────────┼─────────────────┼────────────────────┼────────────────────────┤
│ 1.23 │ 1.23 │ 1.23 │ 1.23 │ 1.23 │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SELECT
DATE_FORMAT('["Cooking", "Reading"]' :: JSON),
TO_STRING('["Cooking", "Reading"]' :: JSON),
TO_TEXT('["Cooking", "Reading"]' :: JSON),
TO_VARCHAR('["Cooking", "Reading"]' :: JSON),
JSON_TO_STRING('["Cooking", "Reading"]' :: JSON);
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ date_format('["cooking", "reading"]'::variant) │ to_string('["cooking", "reading"]'::variant) │ to_text('["cooking", "reading"]'::variant) │ to_varchar('["cooking", "reading"]'::variant) │ json_to_string('["cooking", "reading"]'::variant) │
├────────────────────────────────────────────────┼──────────────────────────────────────────────┼────────────────────────────────────────────┼───────────────────────────────────────────────┼───────────────────────────────────────────────────┤
│ ["Cooking","Reading"] │ ["Cooking","Reading"] │ ["Cooking","Reading"] │ ["Cooking","Reading"] │ ["Cooking","Reading"] │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-- With one argument, the function converts input to a string without validating as a date.
SELECT
DATE_FORMAT('20223-12-25'),
TO_STRING('20223-12-25'),
TO_TEXT('20223-12-25'),
TO_VARCHAR('20223-12-25'),
JSON_TO_STRING('20223-12-25');
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ date_format('20223-12-25') │ to_string('20223-12-25') │ to_text('20223-12-25') │ to_varchar('20223-12-25') │ json_to_string('20223-12-25') │
├────────────────────────────┼──────────────────────────┼────────────────────────┼───────────────────────────┼───────────────────────────────┤
│ 20223-12-25 │ 20223-12-25 │ 20223-12-25 │ 20223-12-25 │ 20223-12-25 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SELECT
DATE_FORMAT('2022-12-25', '%m/%d/%Y'),
TO_STRING('2022-12-25', '%m/%d/%Y'),
TO_TEXT('2022-12-25', '%m/%d/%Y'),
TO_VARCHAR('2022-12-25', '%m/%d/%Y'),
JSON_TO_STRING('2022-12-25', '%m/%d/%Y');
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ date_format('2022-12-25', '%m/%d/%y') │ to_string('2022-12-25', '%m/%d/%y') │ to_text('2022-12-25', '%m/%d/%y') │ to_varchar('2022-12-25', '%m/%d/%y') │ json_to_string('2022-12-25', '%m/%d/%y') │
├───────────────────────────────────────┼─────────────────────────────────────┼───────────────────────────────────┼──────────────────────────────────────┼──────────────────────────────────────────┤
│ 12/25/2022 │ 12/25/2022 │ 12/25/2022 │ 12/25/2022 │ 12/25/2022 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
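The difference between the one-argument and two-argument forms can be sketched in Python. The helper `to_string` is hypothetical, and the sketch assumes the format specifiers behave like the strftime-style codes of the same name, which the `%m/%d/%Y` example above suggests.

```python
from datetime import date

def to_string(value, fmt=None):
    """One argument: plain string conversion. Two arguments: format a date."""
    if fmt is None:
        return str(value)   # no date validation is attempted
    return date.fromisoformat(value).strftime(fmt)

# An invalid date passes through unchanged when no format is given
print(to_string("20223-12-25"))
# With a format, the input is parsed as a date and re-rendered
print(to_string("2022-12-25", "%m/%d/%Y"))
```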
7.13 - TO_TEXT
Alias for TO_STRING.
7.14 - TO_UINT16
Converts a value to UINT16 data type.
Analyze Syntax
func.to_uint16( <expr> )
Analyze Examples
func.to_uint16('123')
┌───────────────────────┐
│ func.to_uint16('123') │
├───────────────────────┤
│ 123 │
└───────────────────────┘
SQL Syntax
TO_UINT16( <expr> )
SQL Examples
SELECT TO_UINT16('123');
┌──────────────────┐
│ to_uint16('123') │
├──────────────────┤
│ 123 │
└──────────────────┘
7.15 - TO_UINT32
Converts a value to UINT32 data type.
Analyze Syntax
func.to_uint32( <expr> )
Analyze Examples
func.to_uint32('123')
┌───────────────────────┐
│ func.to_uint32('123') │
├───────────────────────┤
│ 123 │
└───────────────────────┘
SQL Syntax
TO_UINT32( <expr> )
SQL Examples
SELECT TO_UINT32('123');
┌──────────────────┐
│ to_uint32('123') │
├──────────────────┤
│ 123 │
└──────────────────┘
7.16 - TO_UINT64
Converts a value to UINT64 data type.
Analyze Syntax
func.to_uint64( <expr> )
Analyze Examples
func.to_uint64('123')
┌───────────────────────┐
│ func.to_uint64('123') │
├───────────────────────┤
│ 123 │
└───────────────────────┘
SQL Syntax
TO_UINT64( <expr> )
SQL Examples
SELECT TO_UINT64('123');
┌──────────────────┐
│ to_uint64('123') │
├──────────────────┤
│ 123 │
└──────────────────┘
7.17 - TO_UINT8
Converts a value to UINT8 data type.
Analyze Syntax
func.to_uint8( <expr> )
Analyze Examples
func.to_uint8('123')
┌──────────────────────┐
│ func.to_uint8('123') │
├──────────────────────┤
│ 123 │
└──────────────────────┘
SQL Syntax
TO_UINT8( <expr> )
SQL Examples
SELECT TO_UINT8('123');
┌─────────────────┐
│ to_uint8('123') │
├─────────────────┤
│ 123 │
└─────────────────┘
7.18 - TO_VARCHAR
Alias for TO_STRING.
7.19 - TO_VARIANT
Converts a value to VARIANT data type.
Analyze Syntax
func.to_variant( <expr> )
Analyze Examples
func.to_variant(to_bitmap('100,200,300'))
┌───────────────────────────────────────────┐
│ func.to_variant(to_bitmap('100,200,300')) │
├───────────────────────────────────────────┤
│ [100,200,300] │
└───────────────────────────────────────────┘
SQL Syntax
TO_VARIANT( <expr> )
SQL Examples
SELECT TO_VARIANT(TO_BITMAP('100,200,300'));
┌──────────────────────────────────────┐
│ to_variant(to_bitmap('100,200,300')) │
├──────────────────────────────────────┤
│ [100,200,300] │
└──────────────────────────────────────┘
7.20 - TRY_CAST
Converts a value from one data type to another. Returns NULL on error.
See also: CAST
Analyze Syntax
func.try_cast( <expr>, <data_type> )
Analyze Examples
func.try_cast(1, string)
┌──────────────────────────┐
│ func.try_cast(1, string) │
├──────────────────────────┤
│ 1 │
└──────────────────────────┘
SQL Syntax
TRY_CAST( <expr> AS <data_type> )
SQL Examples
SELECT TRY_CAST(1 AS VARCHAR);
┌───────────────────────┐
│ try_cast(1 as string) │
├───────────────────────┤
│ 1 │
└───────────────────────┘
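The "returns NULL on error" behavior can be pictured with a small Python analogue. This is a sketch of the idea only, not how PlaidCloud Lakehouse implements TRY_CAST; the helper `try_cast` below is hypothetical and uses Python's `None` in place of SQL NULL.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def try_cast(value: object, caster: Callable[[object], T]) -> Optional[T]:
    """Return caster(value), or None when the conversion raises an error."""
    try:
        return caster(value)
    except (TypeError, ValueError):
        return None


print(try_cast(1, str))      # prints 1 (now a string)
print(try_cast("abc", int))  # prints None, where a plain cast would raise
```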
8 - Date & Time Functions
This section provides reference information for the datetime-related functions in PlaidCloud Lakehouse.
Conversion Functions
Date Arithmetic Functions
Others
8.1 - ADD TIME INTERVAL
Adds a time interval to a date or timestamp and returns the result as a date or timestamp.
Analyze Syntax
func.add_years(<expr0>, <expr1>)
func.add_quarters(<expr0>, <expr1>)
func.add_months(<expr0>, <expr1>)
func.add_days(<expr0>, <expr1>)
func.add_hours(<expr0>, <expr1>)
func.add_minutes(<expr0>, <expr1>)
func.add_seconds(<expr0>, <expr1>)
Analyze Examples
func.to_date(18875), func.add_years(func.to_date(18875), 2)
+---------------------------------+---------------------------------------------------+
| func.to_date(18875) | func.add_years(func.to_date(18875), 2) |
+---------------------------------+---------------------------------------------------+
| 2021-09-05 | 2023-09-05 |
+---------------------------------+---------------------------------------------------+
func.to_date(18875), func.add_quarters(func.to_date(18875), 2)
+---------------------------------+---------------------------------------------------+
| func.to_date(18875) | func.add_quarters(func.to_date(18875), 2) |
+---------------------------------+---------------------------------------------------+
| 2021-09-05 | 2022-03-05 |
+---------------------------------+---------------------------------------------------+
func.to_date(18875), func.add_months(func.to_date(18875), 2)
+---------------------------------+---------------------------------------------------+
| func.to_date(18875) | func.add_months(func.to_date(18875), 2) |
+---------------------------------+---------------------------------------------------+
| 2021-09-05 | 2021-11-05 |
+---------------------------------+---------------------------------------------------+
func.to_date(18875), func.add_days(func.to_date(18875), 2)
+---------------------------------+---------------------------------------------------+
| func.to_date(18875) | func.add_days(func.to_date(18875), 2) |
+---------------------------------+---------------------------------------------------+
| 2021-09-05 | 2021-09-07 |
+---------------------------------+---------------------------------------------------+
func.to_datetime(1630833797), func.add_hours(func.to_datetime(1630833797), 2)
+---------------------------------+---------------------------------------------------+
| func.to_datetime(1630833797) | func.add_hours(func.to_datetime(1630833797), 2) |
+---------------------------------+---------------------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 11:23:17.000000 |
+---------------------------------+---------------------------------------------------+
func.to_datetime(1630833797), func.add_minutes(func.to_datetime(1630833797), 2)
+---------------------------------+---------------------------------------------------+
| func.to_datetime(1630833797) | func.add_minutes(func.to_datetime(1630833797), 2) |
+---------------------------------+---------------------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 09:25:17.000000 |
+---------------------------------+---------------------------------------------------+
func.to_datetime(1630833797), func.add_seconds(func.to_datetime(1630833797), 2)
+---------------------------------+---------------------------------------------------+
| func.to_datetime(1630833797) | func.add_seconds(func.to_datetime(1630833797), 2) |
+---------------------------------+---------------------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 09:23:19.000000 |
+---------------------------------+---------------------------------------------------+
SQL Syntax
ADD_YEARS(<expr0>, <expr1>)
ADD_QUARTERS(<expr0>, <expr1>)
ADD_MONTHS(<expr0>, <expr1>)
ADD_DAYS(<expr0>, <expr1>)
ADD_HOURS(<expr0>, <expr1>)
ADD_MINUTES(<expr0>, <expr1>)
ADD_SECONDS(<expr0>, <expr1>)
Return Type
DATE or TIMESTAMP, depending on the input.
SQL Examples
SELECT to_date(18875), add_years(to_date(18875), 2);
+----------------+------------------------------+
| to_date(18875) | add_years(to_date(18875), 2) |
+----------------+------------------------------+
| 2021-09-05 | 2023-09-05 |
+----------------+------------------------------+
SELECT to_date(18875), add_quarters(to_date(18875), 2);
+----------------+---------------------------------+
| to_date(18875) | add_quarters(to_date(18875), 2) |
+----------------+---------------------------------+
| 2021-09-05 | 2022-03-05 |
+----------------+---------------------------------+
SELECT to_date(18875), add_months(to_date(18875), 2);
+----------------+-------------------------------+
| to_date(18875) | add_months(to_date(18875), 2) |
+----------------+-------------------------------+
| 2021-09-05 | 2021-11-05 |
+----------------+-------------------------------+
SELECT to_date(18875), add_days(to_date(18875), 2);
+----------------+-----------------------------+
| to_date(18875) | add_days(to_date(18875), 2) |
+----------------+-----------------------------+
| 2021-09-05 | 2021-09-07 |
+----------------+-----------------------------+
SELECT to_datetime(1630833797), add_hours(to_datetime(1630833797), 2);
+----------------------------+---------------------------------------+
| to_datetime(1630833797) | add_hours(to_datetime(1630833797), 2) |
+----------------------------+---------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 11:23:17.000000 |
+----------------------------+---------------------------------------+
SELECT to_datetime(1630833797), add_minutes(to_datetime(1630833797), 2);
+----------------------------+-----------------------------------------+
| to_datetime(1630833797) | add_minutes(to_datetime(1630833797), 2) |
+----------------------------+-----------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 09:25:17.000000 |
+----------------------------+-----------------------------------------+
SELECT to_datetime(1630833797), add_seconds(to_datetime(1630833797), 2);
+----------------------------+-----------------------------------------+
| to_datetime(1630833797) | add_seconds(to_datetime(1630833797), 2) |
+----------------------------+-----------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 09:23:19.000000 |
+----------------------------+-----------------------------------------+
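The inputs used in these examples can be checked by hand: 18875 is a count of days since the Unix epoch and 1630833797 is a count of seconds since the epoch. The following sketch reproduces two of the results in plain Python, assuming UTC. The helpers `to_date` and `to_datetime` are hypothetical stand-ins written for this illustration, and only fixed-length units are covered; year, quarter, and month arithmetic needs calendar logic that is not shown here.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = date(1970, 1, 1)


def to_date(days: int) -> date:
    """Interpret an integer as days since the Unix epoch."""
    return EPOCH + timedelta(days=days)


def to_datetime(seconds: int) -> datetime:
    """Interpret an integer as seconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


d = to_date(18875)
ts = to_datetime(1630833797)

print(d, d + timedelta(days=2))     # 2021-09-05 2021-09-07
print(ts, ts + timedelta(hours=2))  # 09:23:17 becomes 11:23:17 on 2021-09-05
```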
8.2 - CURRENT_TIMESTAMP
Alias for NOW.
8.3 - DATE
Alias for TO_DATE.
8.4 - DATE DIFF
PlaidCloud Lakehouse does not yet provide a date_diff function, but it supports direct arithmetic on dates and times. For example, the expression TO_DATE(NOW())-2 returns the date from two days ago.
This direct arithmetic makes date and time computations straightforward, as the following example shows:
CREATE TABLE tasks (
task_name VARCHAR(50),
start_date DATE,
end_date DATE
);
INSERT INTO tasks (task_name, start_date, end_date)
VALUES
('Task 1', '2023-06-15', '2023-06-20'),
('Task 2', '2023-06-18', '2023-06-25'),
('Task 3', '2023-06-20', '2023-06-23');
SELECT task_name, end_date - start_date AS duration
FROM tasks;
task_name|duration|
---------+--------+
Task 1 | 5|
Task 2 | 7|
Task 3 | 3|
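For comparison, the same day-count arithmetic can be written in plain Python, where subtracting two dates yields a duration whose `days` attribute matches the SQL result above. The variable names are chosen for this sketch only.

```python
from datetime import date

# Same rows as the tasks table above: (task_name, start_date, end_date).
tasks = [
    ("Task 1", date(2023, 6, 15), date(2023, 6, 20)),
    ("Task 2", date(2023, 6, 18), date(2023, 6, 25)),
    ("Task 3", date(2023, 6, 20), date(2023, 6, 23)),
]

# end_date - start_date, expressed as a whole number of days.
durations = {name: (end - start).days for name, start, end in tasks}
print(durations)  # {'Task 1': 5, 'Task 2': 7, 'Task 3': 3}
```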
8.5 - DATE_ADD
Adds a time or date interval to the provided date or timestamp.
Analyze Syntax
func.date_add(<unit>, <value>, <date_or_time_expr>)
Analyze Examples
func.date_add('YEAR', 1, func.to_date('2018-01-02'))
+------------------------------------------------------+
| func.date_add('YEAR', 1, func.to_date('2018-01-02')) |
+------------------------------------------------------+
| 2019-01-02 |
+------------------------------------------------------+
SQL Syntax
DATE_ADD(<unit>, <value>, <date_or_time_expr>)
Arguments
Arguments | Description |
---|
<unit> | Must be one of the following values: YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, or SECOND |
<value> | The number of units of time to add. For example, to add 2 days, use 2. |
<date_or_time_expr> | A value of DATE or TIMESTAMP type |
Return Type
The function returns a value of the same type as the <date_or_time_expr> argument.
SQL Examples
Query:
SELECT date_add(YEAR, 1, to_date('2018-01-02'));
+---------------------------------------------------+
| DATE_ADD(YEAR, INTERVAL 1, to_date('2018-01-02')) |
+---------------------------------------------------+
| 2019-01-02 |
+---------------------------------------------------+
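To make the role of the unit argument concrete, here is a minimal Python sketch of unit-based date addition. It is an illustration under stated assumptions, not the PlaidCloud Lakehouse implementation: the function `date_add` below is hypothetical, it clamps the day to the end of the month when the target month is shorter (the source does not say how that case is handled), and it rejects sub-day units on plain dates.

```python
import calendar
from datetime import date, datetime, timedelta

MONTHS_PER_UNIT = {"YEAR": 12, "QUARTER": 3, "MONTH": 1}
TIMEDELTA_ARG = {"DAY": "days", "HOUR": "hours", "MINUTE": "minutes", "SECOND": "seconds"}


def date_add(unit: str, value: int, when: date) -> date:
    """Add `value` units to a date or datetime and return the same type."""
    unit = unit.upper()
    if unit in MONTHS_PER_UNIT:
        # Work in whole months so that year and quarter reuse the same path.
        months = when.year * 12 + (when.month - 1) + value * MONTHS_PER_UNIT[unit]
        year, month_index = divmod(months, 12)
        month = month_index + 1
        # Assumption: clamp to the last valid day of the target month.
        day = min(when.day, calendar.monthrange(year, month)[1])
        return when.replace(year=year, month=month, day=day)
    if unit in TIMEDELTA_ARG:
        if unit != "DAY" and not isinstance(when, datetime):
            raise TypeError("sub-day units need a datetime in this sketch")
        return when + timedelta(**{TIMEDELTA_ARG[unit]: value})
    raise ValueError(f"unsupported unit: {unit}")


print(date_add("YEAR", 1, date(2018, 1, 2)))  # 2019-01-02
```

Converting every calendar unit to a month count keeps the year, quarter, and month cases in one code path; a negative value moves the date backward.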
8.6 - DATE_FORMAT
Alias for TO_STRING.
8.7 - DATE_PART
Retrieves the designated portion of a date, time, or timestamp.
See also: EXTRACT
Analyze Syntax
func.date_part(<unit>, <date_or_time_expr>)
Analyze Examples
func.now() |
---------------------+
2023-10-16 02:09:28.0|
func.date_part('day', func.now())
func.date_part('day', func.now())|
----------------------------+
16 |
SQL Syntax
DATE_PART( YEAR | QUARTER | MONTH | WEEK | DAY | HOUR | MINUTE | SECOND | DOW | DOY, <date_or_time_expr> )
- DOW: Day of Week.
- DOY: Day of Year.
Return Type
Integer.
SQL Examples
SELECT NOW();
now() |
---------------------+
2023-10-16 02:09:28.0|
SELECT DATE_PART(DAY, NOW());
date_part(day, now())|
---------------------+
16|
-- October 16, 2023, is a Monday
SELECT DATE_PART(DOW, NOW());
date_part(dow, now())|
---------------------+
1|
-- October 16, 2023, is the 289th day of the year
SELECT DATE_PART(DOY, NOW());
date_part(doy, now())|
---------------------+
289|
SELECT DATE_PART(MONTH, TO_DATE('2022-05-13'));
date_part(month, to_date('2022-05-13'))|
---------------------------------------+
5|
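The values in these examples can be cross-checked with Python's standard `datetime` module. The sketch below is a hypothetical helper, not the Lakehouse function. It assumes ISO day-of-week numbering (the source only confirms that Monday is 1) and omits WEEK because the week-numbering convention is not stated in the source.

```python
from datetime import datetime


def date_part(unit: str, ts: datetime) -> int:
    """Return one component of a timestamp as an integer."""
    parts = {
        "YEAR": ts.year,
        "QUARTER": (ts.month - 1) // 3 + 1,
        "MONTH": ts.month,
        "DAY": ts.day,
        "HOUR": ts.hour,
        "MINUTE": ts.minute,
        "SECOND": ts.second,
        "DOW": ts.isoweekday(),         # assumption: ISO numbering, Monday = 1
        "DOY": ts.timetuple().tm_yday,  # 1-based day of the year
    }
    return parts[unit.upper()]


ts = datetime(2023, 10, 16, 2, 9, 28)
print(date_part("DAY", ts), date_part("DOW", ts), date_part("DOY", ts))  # 16 1 289
```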
8.8 - DATE_SUB
Subtracts a time or date interval from the provided date or timestamp.
Analyze Syntax
func.date_sub(<unit>, <value>, <date_or_time_expr>)
Analyze Examples
func.date_sub('YEAR', 1, func.to_date('2018-01-02'))
+------------------------------------------------------+
| func.date_sub('YEAR', 1, func.to_date('2018-01-02')) |
+------------------------------------------------------+
| 2017-01-02 |
+------------------------------------------------------+
SQL Syntax
DATE_SUB(<unit>, <value>, <date_or_time_expr>)
Arguments
Arguments | Description |
---|
<unit> | Must be one of the following values: YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, or SECOND |
<value> | The number of units of time to subtract. For example, to subtract 2 days, use 2. |
<date_or_time_expr> | A value of DATE or TIMESTAMP type |
Return Type
The function returns a value of the same type as the <date_or_time_expr> argument.
SQL Examples
Query:
SELECT date_sub(YEAR, 1, to_date('2018-01-02'));
+---------------------------------------------------+
| DATE_SUB(YEAR, INTERVAL 1, to_date('2018-01-02')) |
+---------------------------------------------------+
| 2017-01-02 |
+---------------------------------------------------+
8.9 - DATE_TRUNC
Truncates a date, time, or timestamp value to a specified precision. For example, truncating 2022-07-07 to MONTH gives 2022-07-01; truncating 2022-07-07 01:01:01.123456 to SECOND gives 2022-07-07 01:01:01.000000.
Analyze Syntax
func.date_trunc(<precision>, <date_or_time_expr>)
Analyze Examples
func.date_trunc('month', func.to_date('2022-07-07'))
+------------------------------------------------------+
| func.date_trunc('month', func.to_date('2022-07-07')) |
+------------------------------------------------------+
| 2022-07-01 |
+------------------------------------------------------+
SQL Syntax
DATE_TRUNC(<precision>, <date_or_time_expr>)
Arguments
Arguments | Description |
---|
<precision> | Must be one of the following values: YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, or SECOND |
<date_or_time_expr> | A value of DATE or TIMESTAMP type |
Return Type
The function returns a value of the same type as the <date_or_time_expr> argument.
SQL Examples
select date_trunc(month, to_date('2022-07-07'));
+------------------------------------------+
| date_trunc(month, to_date('2022-07-07')) |
+------------------------------------------+
| 2022-07-01 |
+------------------------------------------+
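Truncation amounts to resetting every field finer than the chosen precision to its lowest value. The following Python sketch shows that idea for timestamps. The function `date_trunc` here is a hypothetical illustration and does not reflect the Lakehouse implementation.

```python
from datetime import datetime

# Fields to reset for each precision; QUARTER is handled separately.
RESETS = {
    "YEAR": dict(month=1, day=1, hour=0, minute=0, second=0, microsecond=0),
    "MONTH": dict(day=1, hour=0, minute=0, second=0, microsecond=0),
    "DAY": dict(hour=0, minute=0, second=0, microsecond=0),
    "HOUR": dict(minute=0, second=0, microsecond=0),
    "MINUTE": dict(second=0, microsecond=0),
    "SECOND": dict(microsecond=0),
}


def date_trunc(precision: str, ts: datetime) -> datetime:
    """Reset all fields finer than `precision` to their lowest value."""
    precision = precision.upper()
    if precision == "QUARTER":
        first_month = 3 * ((ts.month - 1) // 3) + 1  # 1, 4, 7, or 10
        return ts.replace(month=first_month, **RESETS["MONTH"])
    return ts.replace(**RESETS[precision])


print(date_trunc("MONTH", datetime(2022, 7, 7)))                      # 2022-07-01 00:00:00
print(date_trunc("SECOND", datetime(2022, 7, 7, 1, 1, 1, 123456)))    # 2022-07-07 01:01:01
```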
8.10 - DAY
Alias for TO_DAY_OF_MONTH.
8.11 - EXTRACT
Retrieves the designated portion of a date, time, or timestamp.
See also: DATE_PART
SQL Syntax
EXTRACT( YEAR | QUARTER | MONTH | WEEK | DAY | HOUR | MINUTE | SECOND | DOW | DOY FROM <date_or_time_expr> )
- DOW: Day of the Week.
- DOY: Day of Year.
Return Type
Integer.
SQL Examples
SELECT NOW();
now() |
---------------------+
2023-10-16 02:09:28.0|
SELECT EXTRACT(DAY FROM NOW());
extract(day from now())|
-----------------------+
16|
-- October 16, 2023, is a Monday
SELECT EXTRACT(DOW FROM NOW());
extract(dow from now())|
-----------------------+
1|
-- October 16, 2023, is the 289th day of the year
SELECT EXTRACT(DOY FROM NOW());
extract(doy from now())|
-----------------------+
289|
SELECT EXTRACT(MONTH FROM TO_DATE('2022-05-13'));
extract(month from to_date('2022-05-13'))|
-----------------------------------------+
5|
8.12 - MONTH
Alias for TO_MONTH.
8.13 - NOW
Returns the current date and time.
Analyze Syntax
func.now()
Analyze Examples
┌─────────────────────────────────────────────────────────┐
│ func.current_timestamp() │ func.now() │
├────────────────────────────┼────────────────────────────┤
│ 2024-01-29 04:38:12.584359 │ 2024-01-29 04:38:12.584417 │
└─────────────────────────────────────────────────────────┘
SQL Syntax
NOW()
Return Type
TIMESTAMP
Aliases
- CURRENT_TIMESTAMP
SQL Examples
This example returns the current date and time:
SELECT CURRENT_TIMESTAMP(), NOW();
┌─────────────────────────────────────────────────────────┐
│ current_timestamp() │ now() │
├────────────────────────────┼────────────────────────────┤
│ 2024-01-29 04:38:12.584359 │ 2024-01-29 04:38:12.584417 │
└─────────────────────────────────────────────────────────┘
8.14 - QUARTER
Alias for TO_QUARTER.
8.15 - STR_TO_DATE
Alias for TO_DATE.
8.16 - STR_TO_TIMESTAMP
Alias for TO_TIMESTAMP.
8.17 - SUBTRACT TIME INTERVAL
Subtracts a time interval from a date or timestamp and returns the result as a date or timestamp.
Analyze Syntax
func.subtract_years(<expr0>, <expr1>)
func.subtract_quarters(<expr0>, <expr1>)
func.subtract_months(<expr0>, <expr1>)
func.subtract_days(<expr0>, <expr1>)
func.subtract_hours(<expr0>, <expr1>)
func.subtract_minutes(<expr0>, <expr1>)
func.subtract_seconds(<expr0>, <expr1>)
Analyze Examples
func.to_date(18875), func.subtract_years(func.to_date(18875), 2)
+---------------------------------+--------------------------------------------------------+
| func.to_date(18875) | func.subtract_years(func.to_date(18875), 2) |
+---------------------------------+--------------------------------------------------------+
| 2021-09-05 | 2019-09-05 |
+---------------------------------+--------------------------------------------------------+
func.to_date(18875), func.subtract_quarters(func.to_date(18875), 2)
+---------------------------------+--------------------------------------------------------+
| func.to_date(18875) | func.subtract_quarters(func.to_date(18875), 2) |
+---------------------------------+--------------------------------------------------------+
| 2021-09-05 | 2021-03-05 |
+---------------------------------+--------------------------------------------------------+
func.to_date(18875), func.subtract_months(func.to_date(18875), 2)
+---------------------------------+--------------------------------------------------------+
| func.to_date(18875) | func.subtract_months(func.to_date(18875), 2) |
+---------------------------------+--------------------------------------------------------+
| 2021-09-05 | 2021-07-05 |
+---------------------------------+--------------------------------------------------------+
func.to_date(18875), func.subtract_days(func.to_date(18875), 2)
+---------------------------------+--------------------------------------------------------+
| func.to_date(18875) | func.subtract_days(func.to_date(18875), 2) |
+---------------------------------+--------------------------------------------------------+
| 2021-09-05 | 2021-09-03 |
+---------------------------------+--------------------------------------------------------+
func.to_datetime(1630833797), func.subtract_hours(func.to_datetime(1630833797), 2)
+---------------------------------+--------------------------------------------------------+
| func.to_datetime(1630833797) | func.subtract_hours(func.to_datetime(1630833797), 2) |
+---------------------------------+--------------------------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 07:23:17.000000 |
+---------------------------------+--------------------------------------------------------+
func.to_datetime(1630833797), func.subtract_minutes(func.to_datetime(1630833797), 2)
+---------------------------------+--------------------------------------------------------+
| func.to_datetime(1630833797) | func.subtract_minutes(func.to_datetime(1630833797), 2) |
+---------------------------------+--------------------------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 09:21:17.000000 |
+---------------------------------+--------------------------------------------------------+
func.to_datetime(1630833797), func.subtract_seconds(func.to_datetime(1630833797), 2)
+---------------------------------+--------------------------------------------------------+
| func.to_datetime(1630833797) | func.subtract_seconds(func.to_datetime(1630833797), 2) |
+---------------------------------+--------------------------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 09:23:15.000000 |
+---------------------------------+--------------------------------------------------------+
SQL Syntax
SUBTRACT_YEARS(<expr0>, <expr1>)
SUBTRACT_QUARTERS(<expr0>, <expr1>)
SUBTRACT_MONTHS(<expr0>, <expr1>)
SUBTRACT_DAYS(<expr0>, <expr1>)
SUBTRACT_HOURS(<expr0>, <expr1>)
SUBTRACT_MINUTES(<expr0>, <expr1>)
SUBTRACT_SECONDS(<expr0>, <expr1>)
Return Type
DATE or TIMESTAMP, depending on the input.
SQL Examples
SELECT to_date(18875), subtract_years(to_date(18875), 2);
+----------------+-----------------------------------+
| to_date(18875) | subtract_years(to_date(18875), 2) |
+----------------+-----------------------------------+
| 2021-09-05 | 2019-09-05 |
+----------------+-----------------------------------+
SELECT to_date(18875), subtract_quarters(to_date(18875), 2);
+----------------+--------------------------------------+
| to_date(18875) | subtract_quarters(to_date(18875), 2) |
+----------------+--------------------------------------+
| 2021-09-05 | 2021-03-05 |
+----------------+--------------------------------------+
SELECT to_date(18875), subtract_months(to_date(18875), 2);
+----------------+------------------------------------+
| to_date(18875) | subtract_months(to_date(18875), 2) |
+----------------+------------------------------------+
| 2021-09-05 | 2021-07-05 |
+----------------+------------------------------------+
SELECT to_date(18875), subtract_days(to_date(18875), 2);
+----------------+----------------------------------+
| to_date(18875) | subtract_days(to_date(18875), 2) |
+----------------+----------------------------------+
| 2021-09-05 | 2021-09-03 |
+----------------+----------------------------------+
SELECT to_datetime(1630833797), subtract_hours(to_datetime(1630833797), 2);
+----------------------------+--------------------------------------------+
| to_datetime(1630833797) | subtract_hours(to_datetime(1630833797), 2) |
+----------------------------+--------------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 07:23:17.000000 |
+----------------------------+--------------------------------------------+
SELECT to_datetime(1630833797), subtract_minutes(to_datetime(1630833797), 2);
+----------------------------+----------------------------------------------+
| to_datetime(1630833797) | subtract_minutes(to_datetime(1630833797), 2) |
+----------------------------+----------------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 09:21:17.000000 |
+----------------------------+----------------------------------------------+
SELECT to_datetime(1630833797), subtract_seconds(to_datetime(1630833797), 2);
+----------------------------+----------------------------------------------+
| to_datetime(1630833797) | subtract_seconds(to_datetime(1630833797), 2) |
+----------------------------+----------------------------------------------+
| 2021-09-05 09:23:17.000000 | 2021-09-05 09:23:15.000000 |
+----------------------------+----------------------------------------------+
8.18 - TIME_SLOT
Rounds the time down to the start of its half-hour interval.
Analyze Syntax
func.time_slot(<expr>)
Analyze Examples
func.time_slot('2023-11-12 09:38:18.165575')
┌──────────────────────────────────────────────┐
│ func.time_slot('2023-11-12 09:38:18.165575') │
│ Timestamp │
├──────────────────────────────────────────────┤
│ 2023-11-12 09:30:00 │
└──────────────────────────────────────────────┘
SQL Syntax
TIME_SLOT(<expr>)
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TIMESTAMP, returned in "YYYY-MM-DD hh:mm:ss.ffffff" format.
SQL Examples
SELECT
time_slot('2023-11-12 09:38:18.165575')
┌─────────────────────────────────────────┐
│ time_slot('2023-11-12 09:38:18.165575') │
│ Timestamp │
├─────────────────────────────────────────┤
│ 2023-11-12 09:30:00 │
└─────────────────────────────────────────┘
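The rounding shown above can be expressed as "drop the minutes past the last half-hour mark, then zero the seconds". A minimal Python sketch of that arithmetic follows; the function name mirrors the SQL one for readability, but the code is an illustration and not the Lakehouse implementation.

```python
from datetime import datetime


def time_slot(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its half-hour interval."""
    slot_minute = ts.minute - ts.minute % 30  # 0 or 30
    return ts.replace(minute=slot_minute, second=0, microsecond=0)


print(time_slot(datetime(2023, 11, 12, 9, 38, 18, 165575)))  # 2023-11-12 09:30:00
```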
8.19 - TIMEZONE
Returns the timezone for the current connection.
PlaidCloud Lakehouse uses UTC (Coordinated Universal Time) as the default timezone and allows you to change the timezone to your current geographic location. For the available values you can assign to the timezone setting, refer to https://docs.rs/chrono-tz/latest/chrono_tz/enum.Tz.html. See the examples below for details.
Analyze Syntax
func.timezone()
Analyze Examples
func.timezone()
┌─────────────────────┐
│ timezone │
├─────────────────────┤
│ UTC │
└─────────────────────┘
SQL Syntax
SELECT TIMEZONE();
SQL Examples
-- Return the current timezone
SELECT TIMEZONE();
+-----------------+
| TIMEZONE('UTC') |
+-----------------+
| UTC |
+-----------------+
-- Set the timezone to China Standard Time
SET timezone='Asia/Shanghai';
SELECT TIMEZONE();
+---------------------------+
| TIMEZONE('Asia/Shanghai') |
+---------------------------+
| Asia/Shanghai |
+---------------------------+
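As a general illustration of what a connection timezone means, the Python sketch below renders one instant as wall-clock time in UTC and in a fixed UTC+8 zone standing in for Asia/Shanghai. It does not describe how PlaidCloud Lakehouse applies the setting internally; the fixed-offset zone and the sample timestamp are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+8 offset stands in for the Asia/Shanghai zone.
SHANGHAI = timezone(timedelta(hours=8), name="Asia/Shanghai")

instant = datetime(2024, 1, 29, 4, 38, 12, tzinfo=timezone.utc)
local = instant.astimezone(SHANGHAI)

print(instant.isoformat())  # 2024-01-29T04:38:12+00:00
print(local.isoformat())    # 2024-01-29T12:38:12+08:00
```

Both values denote the same moment; only the wall-clock rendering differs.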
8.20 - TO_DATE
Converts an expression to a date. The supported conversions are:
- Converting a timestamp-format string to a date: Extracts the date from the given string.
- Converting an integer to a date: Interprets the integer as the number of days before (for negative numbers) or after (for positive numbers) the Unix epoch (midnight on January 1, 1970). A DATE value ranges from 1000-01-01 to 9999-12-31, so PlaidCloud Lakehouse returns an error if you run "SELECT TO_DATE(9999999999999999999)".
- Converting a string to a date using the specified format: The function takes two arguments and converts the first string to a date based on the format specified in the second string. Specifiers can be used to customize the date and time format; for a comprehensive list of supported specifiers, see Formatting Date and Time.
See also: TO_TIMESTAMP
Analyze Syntax
func.to_date('<timestamp_expr>')
func.to_date(<integer>)
func.to_date('<string>', '<format>')
Analyze Examples
func.typeof(func.to_date('2022-01-02')), func.typeof(func.str_to_date('2022-01-02'))
┌───────────────────────────────────────────────────────────────────────────────────────┐
│ func.typeof(func.to_date('2022-01-02')) │ func.typeof(func.str_to_date('2022-01-02')) │
├─────────────────────────────────────────┼─────────────────────────────────────────────┤
│ DATE │ DATE │
└───────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
-- Convert a timestamp-format string
TO_DATE('<timestamp_expr>')
-- Convert an integer
TO_DATE(<integer>)
-- Convert a string using the given format
TO_DATE('<string>', '<format>')
Aliases
- DATE
- STR_TO_DATE
Return Type
The function returns a date in the format "YYYY-MM-DD":
SELECT TYPEOF(TO_DATE('2022-01-02')), TYPEOF(STR_TO_DATE('2022-01-02'));
┌───────────────────────────────────────────────────────────────────┐
│ typeof(to_date('2022-01-02')) │ typeof(str_to_date('2022-01-02')) │
├───────────────────────────────┼───────────────────────────────────┤
│ DATE │ DATE │
└───────────────────────────────────────────────────────────────────┘
To convert the returned date back to a string, use the DATE_FORMAT function:
SELECT DATE_FORMAT(TO_DATE('2022-01-02')) AS dt, TYPEOF(dt);
┌─────────────────────────┐
│ dt │ typeof(dt) │
├────────────┼────────────┤
│ 2022-01-02 │ VARCHAR │
└─────────────────────────┘
SQL Examples
SELECT TO_DATE('2022-01-02T01:12:00+07:00'), STR_TO_DATE('2022-01-02T01:12:00+07:00');
┌─────────────────────────────────────────────────────────────────────────────────┐
│ to_date('2022-01-02t01:12:00+07:00') │ str_to_date('2022-01-02t01:12:00+07:00') │
├──────────────────────────────────────┼──────────────────────────────────────────┤
│ 2022-01-01 │ 2022-01-01 │
└─────────────────────────────────────────────────────────────────────────────────┘
SELECT TO_DATE('2022-01-02'), STR_TO_DATE('2022-01-02');
┌───────────────────────────────────────────────────┐
│ to_date('2022-01-02') │ str_to_date('2022-01-02') │
├───────────────────────┼───────────────────────────┤
│ 2022-01-02 │ 2022-01-02 │
└───────────────────────────────────────────────────┘
SQL Examples 2: Converting an Integer
SELECT TO_DATE(1), STR_TO_DATE(1), TO_DATE(-1), STR_TO_DATE(-1);
┌───────────────────────────────────────────────────────────────────┐
│ to_date(1) │ str_to_date(1) │ to_date((- 1)) │ str_to_date((- 1)) │
│ Date │ Date │ Date │ Date │
├────────────┼────────────────┼────────────────┼────────────────────┤
│ 1970-01-02 │ 1970-01-02 │ 1969-12-31 │ 1969-12-31 │
└───────────────────────────────────────────────────────────────────┘
SELECT TO_DATE('12/25/2022','%m/%d/%Y'), STR_TO_DATE('12/25/2022','%m/%d/%Y');
┌───────────────────────────────────────────────────────────────────────────┐
│ to_date('12/25/2022', '%m/%d/%y') │ str_to_date('12/25/2022', '%m/%d/%y') │
├───────────────────────────────────┼───────────────────────────────────────┤
│ 2022-12-25 │ 2022-12-25 │
└───────────────────────────────────────────────────────────────────────────┘
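The three conversions above can be mirrored in plain Python to see why each result comes out as it does. This is a sketch under assumptions, not the Lakehouse implementation: the helper names are hypothetical, and the first helper assumes the connection timezone is UTC, which is why a timestamp at 01:12 with a +07:00 offset falls on the previous day.

```python
from datetime import date, datetime, timedelta, timezone


def date_from_timestamp_string(text: str) -> date:
    """Parse an ISO 8601 timestamp and return its date as seen in UTC."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is not None:
        parsed = parsed.astimezone(timezone.utc)  # assumption: session timezone is UTC
    return parsed.date()


def date_from_days(days: int) -> date:
    """Interpret an integer as days before or after the Unix epoch."""
    return date(1970, 1, 1) + timedelta(days=days)


def date_from_format(text: str, fmt: str) -> date:
    """Parse a string using strptime-style specifiers."""
    return datetime.strptime(text, fmt).date()


print(date_from_timestamp_string("2022-01-02T01:12:00+07:00"))  # 2022-01-01
print(date_from_days(1), date_from_days(-1))                    # 1970-01-02 1969-12-31
print(date_from_format("12/25/2022", "%m/%d/%Y"))               # 2022-12-25
```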
8.21 - TO_DATETIME
Alias for TO_TIMESTAMP.
8.22 - TO_DAY_OF_MONTH
Converts a date or date with time (timestamp/datetime) to a UInt8 number containing the number of the day of the month (1-31).
Analyze Syntax
func.to_day_of_month(<expr>)
Analyze Examples
func.now(), func.to_day_of_month(func.now()), func.day(func.now())
┌──────────────────────────────────────────────────────────────────────────────────────┐
│ func.now() │ func.to_day_of_month(func.now()) │ func.day(func.now()) │
├────────────────────────────┼──────────────────────────────────┼──────────────────────┤
│ 2024-03-14 23:35:41.947962 │ 14 │ 14 │
└──────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_DAY_OF_MONTH(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Aliases
- DAY
Return Type
TINYINT
SQL Examples
SELECT NOW(), TO_DAY_OF_MONTH(NOW()), DAY(NOW());
┌──────────────────────────────────────────────────────────────────┐
│ now() │ to_day_of_month(now()) │ day(now()) │
├────────────────────────────┼────────────────────────┼────────────┤
│ 2024-03-14 23:35:41.947962 │ 14 │ 14 │
└──────────────────────────────────────────────────────────────────┘
8.23 - TO_DAY_OF_WEEK
Converts a date or date with time (timestamp/datetime) to a UInt8 number containing the number of the day of the week (Monday is 1, and Sunday is 7).
Analyze Syntax
func.to_day_of_week(<expr>)
Analyze Examples
func.to_day_of_week('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────┐
│ func.to_day_of_week('2023-11-12 09:38:18.165575') │
│ UInt8 │
├────────────────────────────────────────────────────┤
│ 7 │
└────────────────────────────────────────────────────┘
SQL Syntax
TO_DAY_OF_WEEK(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Return Type
TINYINT
SQL Examples
SELECT
to_day_of_week('2023-11-12 09:38:18.165575')
┌──────────────────────────────────────────────┐
│ to_day_of_week('2023-11-12 09:38:18.165575') │
│ UInt8 │
├──────────────────────────────────────────────┤
│ 7 │
└──────────────────────────────────────────────┘
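The Monday-is-1, Sunday-is-7 numbering described above matches ISO 8601 weekday numbering, which Python exposes as `isoweekday()`. The short sketch below checks the example value; the function name mirrors the SQL one for readability only and is not the Lakehouse implementation.

```python
from datetime import datetime


def to_day_of_week(ts: datetime) -> int:
    """Return the ISO weekday number: Monday is 1 and Sunday is 7."""
    return ts.isoweekday()


ts = datetime.fromisoformat("2023-11-12 09:38:18.165575")
print(to_day_of_week(ts))  # 7, because November 12, 2023 is a Sunday
```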
8.24 - TO_DAY_OF_YEAR
Converts a date or date with time (timestamp/datetime) to a UInt16 number containing the number of the day of the year (1-366).
Analyze Syntax
func.to_day_of_year(<expr>)
Analyze Examples
func.to_day_of_year('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────┐
│ func.to_day_of_year('2023-11-12 09:38:18.165575') │
│ UInt16 │
├────────────────────────────────────────────────────┤
│ 316 │
└────────────────────────────────────────────────────┘
SQL Syntax
TO_DAY_OF_YEAR(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Return Type
SMALLINT
SQL Examples
SELECT
to_day_of_year('2023-11-12 09:38:18.165575')
┌──────────────────────────────────────────────┐
│ to_day_of_year('2023-11-12 09:38:18.165575') │
│ UInt16 │
├──────────────────────────────────────────────┤
│ 316 │
└──────────────────────────────────────────────┘
8.25 - TO_HOUR
Converts a date with time (timestamp/datetime) to a UInt8 number containing the number of the hour in 24-hour time (0-23).
This function assumes that if clocks are moved ahead, it is by one hour and occurs at 2 a.m., and if clocks are moved back, it is by one hour and occurs at 3 a.m. This assumption does not always hold: in Moscow, for example, the clocks were twice changed at a different time.
Analyze Syntax
func.to_hour(<expr>)
Analyze Examples
func.to_hour('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────┐
│ func.to_hour('2023-11-12 09:38:18.165575') │
│ UInt8 │
├────────────────────────────────────────────────────┤
│ 9 │
└────────────────────────────────────────────────────┘
SQL Syntax
TO_HOUR(<expr>)
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TINYINT
SQL Examples
SELECT
to_hour('2023-11-12 09:38:18.165575')
┌───────────────────────────────────────┐
│ to_hour('2023-11-12 09:38:18.165575') │
│ UInt8 │
├───────────────────────────────────────┤
│ 9 │
└───────────────────────────────────────┘
8.26 - TO_MINUTE
Converts a date with time (timestamp/datetime) to a UInt8 number containing the number of the minute of the hour (0-59).
Analyze Syntax
func.to_minute(<expr>)
Analyze Examples
func.to_minute('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────┐
│ func.to_minute('2023-11-12 09:38:18.165575') │
│ UInt8 │
├────────────────────────────────────────────────────┤
│ 38 │
└────────────────────────────────────────────────────┘
SQL Syntax
TO_MINUTE(<expr>)
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TINYINT
SQL Examples
SELECT
to_minute('2023-11-12 09:38:18.165575')
┌─────────────────────────────────────────┐
│ to_minute('2023-11-12 09:38:18.165575') │
│ UInt8 │
├─────────────────────────────────────────┤
│ 38 │
└─────────────────────────────────────────┘
8.27 - TO_MONDAY
Rounds down a date or date with time (timestamp/datetime) to the nearest Monday.
Returns the date.
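A minimal plain-Python sketch of the rounding, assuming the input parses as an ISO-style timestamp string. The helper name `to_monday` here is a local Python function for illustration, not the Lakehouse function itself:

```python
from datetime import date, datetime, timedelta

def to_monday(ts: str) -> date:
    """Round a timestamp string down to the Monday of its week."""
    d = datetime.fromisoformat(ts).date()
    # weekday() is 0 for Monday, so subtracting it lands on Monday.
    return d - timedelta(days=d.weekday())

print(to_monday("2023-11-12 09:38:18.165575"))  # 2023-11-06
```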
Analyze Syntax
func.to_monday(<expr>)
Analyze Examples
func.to_monday('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────┐
│ func.to_monday('2023-11-12 09:38:18.165575') │
│ Date │
├────────────────────────────────────────────────────┤
│ 2023-11-06 │
└────────────────────────────────────────────────────┘
SQL Syntax
TO_MONDAY(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Return Type
DATE, returns date in “YYYY-MM-DD” format.
SQL Examples
SELECT
to_monday('2023-11-12 09:38:18.165575')
┌─────────────────────────────────────────┐
│ to_monday('2023-11-12 09:38:18.165575') │
│ Date │
├─────────────────────────────────────────┤
│ 2023-11-06 │
└─────────────────────────────────────────┘
8.28 - TO_MONTH
Converts a date or date with time (timestamp/datetime) to a UInt8 number containing the month number (1-12).
Analyze Syntax
func.to_month(<expr>)
Analyze Examples
func.now(), func.to_month(func.now()), func.month(func.now())
┌─────────────────────────────────────────────────────────────────────────────────┐
│ func.now() │ func.to_month(func.now()) │ func.month(func.now()) │
├────────────────────────────┼───────────────────────────┼────────────────────────┤
│ 2024-03-14 23:34:02.161291 │ 3 │ 3 │
└─────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_MONTH(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Aliases
- MONTH
Return Type
TINYINT
SQL Examples
SELECT NOW(), TO_MONTH(NOW()), MONTH(NOW());
┌─────────────────────────────────────────────────────────────┐
│ now() │ to_month(now()) │ month(now()) │
├────────────────────────────┼─────────────────┼──────────────┤
│ 2024-03-14 23:34:02.161291 │ 3 │ 3 │
└─────────────────────────────────────────────────────────────┘
8.29 - TO_QUARTER
Retrieves the quarter (1, 2, 3, or 4) from a given date or timestamp.
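The quarter follows directly from the month number. A minimal plain-Python sketch of that arithmetic (the helper name `quarter_of` is hypothetical and this is not Lakehouse code):

```python
from datetime import datetime

def quarter_of(ts: str) -> int:
    """Map months 1-3 to quarter 1, 4-6 to 2, 7-9 to 3, and 10-12 to 4."""
    month = datetime.fromisoformat(ts).month
    return (month - 1) // 3 + 1

print(quarter_of("2024-03-14 23:32:52.743133"))  # 1
```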
Analyze Syntax
func.to_quarter(<date_or_time_expr>)
Analyze Examples
func.now(), func.to_quarter(func.now()), func.quarter(func.now())
┌─────────────────────────────────────────────────────────────────────────────────────┐
│ func.now() │ func.to_quarter(func.now()) │ func.quarter(func.now()) │
├────────────────────────────┼─────────────────────────────┼──────────────────────────┤
│ 2024-03-14 23:32:52.743133 │ 1 │ 1 │
└─────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_QUARTER( <date_or_time_expr> )
Aliases
- QUARTER
Return Type
Integer.
SQL Examples
SELECT NOW(), TO_QUARTER(NOW()), QUARTER(NOW());
┌─────────────────────────────────────────────────────────────────┐
│ now() │ to_quarter(now()) │ quarter(now()) │
├────────────────────────────┼───────────────────┼────────────────┤
│ 2024-03-14 23:32:52.743133 │ 1 │ 1 │
└─────────────────────────────────────────────────────────────────┘
8.30 - TO_SECOND
Converts a date with time (timestamp/datetime) to a UInt8 number containing the number of the second in the minute (0-59).
Analyze Syntax
func.to_second(<expr>)
Analyze Examples
func.to_second('2023-11-12 09:38:18.165575')
┌──────────────────────────────────────────────┐
│ func.to_second('2023-11-12 09:38:18.165575') │
│ UInt8 │
├──────────────────────────────────────────────┤
│ 18 │
└──────────────────────────────────────────────┘
SQL Syntax
TO_SECOND(<expr>)
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TINYINT
SQL Examples
SELECT
to_second('2023-11-12 09:38:18.165575')
┌─────────────────────────────────────────┐
│ to_second('2023-11-12 09:38:18.165575') │
│ UInt8 │
├─────────────────────────────────────────┤
│ 18 │
└─────────────────────────────────────────┘
8.31 - TO_START_OF_DAY
Rounds down a date with time (timestamp/datetime) to the start of the day.
Analyze Syntax
func.to_start_of_day(<expr>)
Analyze Examples
func.to_start_of_day('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────┐
│ func.to_start_of_day('2023-11-12 09:38:18.165575') │
│ Timestamp │
├────────────────────────────────────────────────────┤
│ 2023-11-12 00:00:00 │
└────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_DAY( <expr> )
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TIMESTAMP, returns a timestamp in “YYYY-MM-DD hh:mm:ss.ffffff” format.
SQL Examples
SELECT
to_start_of_day('2023-11-12 09:38:18.165575')
┌───────────────────────────────────────────────┐
│ to_start_of_day('2023-11-12 09:38:18.165575') │
│ Timestamp │
├───────────────────────────────────────────────┤
│ 2023-11-12 00:00:00 │
└───────────────────────────────────────────────┘
8.32 - TO_START_OF_FIFTEEN_MINUTES
Rounds down a date with time (timestamp/datetime) to the start of the fifteen-minute interval.
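Rounding down to a fixed minute interval amounts to dropping the remainder of the minute value and zeroing the seconds. A minimal plain-Python sketch, where the helper name `floor_to_minutes` and its `step` parameter are hypothetical; the same arithmetic applies to five-minute and ten-minute intervals:

```python
from datetime import datetime

def floor_to_minutes(ts: str, step: int) -> datetime:
    """Round a timestamp down to the start of its step-minute interval."""
    t = datetime.fromisoformat(ts)
    floored_minute = t.minute - t.minute % step
    return t.replace(minute=floored_minute, second=0, microsecond=0)

print(floor_to_minutes("2023-11-12 09:38:18.165575", 15))  # 2023-11-12 09:30:00
```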
Analyze Syntax
func.to_start_of_fifteen_minutes(<expr>)
Analyze Examples
func.to_start_of_fifteen_minutes('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_fifteen_minutes('2023-11-12 09:38:18.165575') │
│ Timestamp │
├────────────────────────────────────────────────────────────────┤
│ 2023-11-12 09:30:00 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_FIFTEEN_MINUTES(<expr>)
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TIMESTAMP, returns a timestamp in “YYYY-MM-DD hh:mm:ss.ffffff” format.
SQL Examples
SELECT
to_start_of_fifteen_minutes('2023-11-12 09:38:18.165575')
┌───────────────────────────────────────────────────────────┐
│ to_start_of_fifteen_minutes('2023-11-12 09:38:18.165575') │
│ Timestamp │
├───────────────────────────────────────────────────────────┤
│ 2023-11-12 09:30:00 │
└───────────────────────────────────────────────────────────┘
8.33 - TO_START_OF_FIVE_MINUTES
Rounds down a date with time (timestamp/datetime) to the start of the five-minute interval.
Analyze Syntax
func.to_start_of_five_minutes(<expr>)
Analyze Examples
func.to_start_of_five_minutes('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_five_minutes('2023-11-12 09:38:18.165575') │
│ Timestamp │
├────────────────────────────────────────────────────────────────┤
│ 2023-11-12 09:35:00 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_FIVE_MINUTES(<expr>)
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TIMESTAMP, returns a timestamp in “YYYY-MM-DD hh:mm:ss.ffffff” format.
SQL Examples
SELECT
to_start_of_five_minutes('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────┐
│ to_start_of_five_minutes('2023-11-12 09:38:18.165575') │
│ Timestamp │
├────────────────────────────────────────────────────────┤
│ 2023-11-12 09:35:00 │
└────────────────────────────────────────────────────────┘
8.34 - TO_START_OF_HOUR
Rounds down a date with time (timestamp/datetime) to the start of the hour.
Analyze Syntax
func.to_start_of_hour(<expr>)
Analyze Examples
func.to_start_of_hour('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_hour('2023-11-12 09:38:18.165575') │
│ Timestamp │
├────────────────────────────────────────────────────────────────┤
│ 2023-11-12 09:00:00 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_HOUR(<expr>)
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TIMESTAMP, returns a timestamp in “YYYY-MM-DD hh:mm:ss.ffffff” format.
SQL Examples
SELECT
to_start_of_hour('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────┐
│ to_start_of_hour('2023-11-12 09:38:18.165575') │
│ Timestamp │
├────────────────────────────────────────────────┤
│ 2023-11-12 09:00:00 │
└────────────────────────────────────────────────┘
8.35 - TO_START_OF_ISO_YEAR
Returns the first day of the ISO year for a date or a date with time (timestamp/datetime).
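The ISO year begins on the Monday of ISO week 1, which can fall in the previous or the same calendar year. A minimal plain-Python sketch of that rule, assuming Python 3.8 or later for `date.fromisocalendar`; the helper name `start_of_iso_year` is hypothetical:

```python
from datetime import date, datetime

def start_of_iso_year(ts: str) -> date:
    """Return the Monday of ISO week 1 for the ISO year containing ts."""
    d = datetime.fromisoformat(ts).date()
    iso_year = d.isocalendar()[0]
    return date.fromisocalendar(iso_year, 1, 1)

print(start_of_iso_year("2023-11-12 09:38:18.165575"))  # 2023-01-02
```

Note that a date early in January can belong to the previous ISO year: 2021-01-02 falls in ISO year 2020, which started on 2019-12-30.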
Analyze Syntax
func.to_start_of_iso_year(<expr>)
Analyze Examples
func.to_start_of_iso_year('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_iso_year('2023-11-12 09:38:18.165575') │
│ Date │
├────────────────────────────────────────────────────────────────┤
│ 2023-01-02 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_ISO_YEAR(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Return Type
DATE, returns date in “YYYY-MM-DD” format.
SQL Examples
SELECT
to_start_of_iso_year('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────┐
│ to_start_of_iso_year('2023-11-12 09:38:18.165575') │
│ Date │
├────────────────────────────────────────────────────┤
│ 2023-01-02 │
└────────────────────────────────────────────────────┘
8.36 - TO_START_OF_MINUTE
Rounds down a date with time (timestamp/datetime) to the start of the minute.
Analyze Syntax
func.to_start_of_minute(<expr>)
Analyze Examples
func.to_start_of_minute('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_minute('2023-11-12 09:38:18.165575') │
│ Timestamp │
├────────────────────────────────────────────────────────────────┤
│ 2023-11-12 09:38:00 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_MINUTE( <expr> )
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TIMESTAMP, returns a timestamp in “YYYY-MM-DD hh:mm:ss.ffffff” format.
SQL Examples
SELECT
to_start_of_minute('2023-11-12 09:38:18.165575')
┌──────────────────────────────────────────────────┐
│ to_start_of_minute('2023-11-12 09:38:18.165575') │
│ Timestamp │
├──────────────────────────────────────────────────┤
│ 2023-11-12 09:38:00 │
└──────────────────────────────────────────────────┘
8.37 - TO_START_OF_MONTH
Rounds down a date or date with time (timestamp/datetime) to the first day of the month.
Returns the date.
Analyze Syntax
func.to_start_of_month(<expr>)
Analyze Examples
func.to_start_of_month('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_month('2023-11-12 09:38:18.165575') │
│ Date │
├────────────────────────────────────────────────────────────────┤
│ 2023-11-01 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_MONTH(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Return Type
DATE, returns date in “YYYY-MM-DD” format.
SQL Examples
SELECT
to_start_of_month('2023-11-12 09:38:18.165575')
┌─────────────────────────────────────────────────┐
│ to_start_of_month('2023-11-12 09:38:18.165575') │
│ Date │
├─────────────────────────────────────────────────┤
│ 2023-11-01 │
└─────────────────────────────────────────────────┘
8.38 - TO_START_OF_QUARTER
Rounds down a date or date with time (timestamp/datetime) to the first day of the quarter.
The first day of the quarter is either 1 January, 1 April, 1 July, or 1 October.
Returns the date.
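A minimal plain-Python sketch of how the quarter start can be derived from the month number (the helper name `start_of_quarter` is hypothetical and this is not Lakehouse code):

```python
from datetime import date, datetime

def start_of_quarter(ts: str) -> date:
    """Return 1 January, 1 April, 1 July, or 1 October for the quarter of ts."""
    d = datetime.fromisoformat(ts).date()
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)

print(start_of_quarter("2023-11-12 09:38:18.165575"))  # 2023-10-01
```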
Analyze Syntax
func.to_start_of_quarter(<expr>)
Analyze Examples
func.to_start_of_quarter('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_quarter('2023-11-12 09:38:18.165575') │
│ Date │
├────────────────────────────────────────────────────────────────┤
│ 2023-10-01 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_QUARTER(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Return Type
DATE, returns date in “YYYY-MM-DD” format.
SQL Examples
SELECT
to_start_of_quarter('2023-11-12 09:38:18.165575')
┌───────────────────────────────────────────────────┐
│ to_start_of_quarter('2023-11-12 09:38:18.165575') │
│ Date │
├───────────────────────────────────────────────────┤
│ 2023-10-01 │
└───────────────────────────────────────────────────┘
8.39 - TO_START_OF_SECOND
Rounds down a date with time (timestamp/datetime) to the start of the second.
Analyze Syntax
func.to_start_of_second(<expr>)
Analyze Examples
func.to_start_of_second('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_second('2023-11-12 09:38:18.165575') │
│ Timestamp │
├────────────────────────────────────────────────────────────────┤
│ 2023-11-12 09:38:18 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_SECOND(<expr>)
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TIMESTAMP, returns a timestamp in “YYYY-MM-DD hh:mm:ss.ffffff” format.
SQL Examples
SELECT
to_start_of_second('2023-11-12 09:38:18.165575')
┌──────────────────────────────────────────────────┐
│ to_start_of_second('2023-11-12 09:38:18.165575') │
│ Timestamp │
├──────────────────────────────────────────────────┤
│ 2023-11-12 09:38:18 │
└──────────────────────────────────────────────────┘
8.40 - TO_START_OF_TEN_MINUTES
Rounds down a date with time (timestamp/datetime) to the start of the ten-minute interval.
Analyze Syntax
func.to_start_of_ten_minutes(<expr>)
Analyze Examples
func.to_start_of_ten_minutes('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_ten_minutes('2023-11-12 09:38:18.165575') │
│ Timestamp │
├────────────────────────────────────────────────────────────────┤
│ 2023-11-12 09:30:00 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_TEN_MINUTES(<expr>)
Arguments
Arguments | Description |
---|
<expr> | timestamp |
Return Type
TIMESTAMP, returns a timestamp in “YYYY-MM-DD hh:mm:ss.ffffff” format.
SQL Examples
SELECT
to_start_of_ten_minutes('2023-11-12 09:38:18.165575')
┌───────────────────────────────────────────────────────┐
│ to_start_of_ten_minutes('2023-11-12 09:38:18.165575') │
│ Timestamp │
├───────────────────────────────────────────────────────┤
│ 2023-11-12 09:30:00 │
└───────────────────────────────────────────────────────┘
8.41 - TO_START_OF_WEEK
Returns the first day of the week for a date or a date with time (timestamp/datetime).
The first day of a week can be Sunday or Monday, which is specified by the argument mode.
Analyze Syntax
func.to_start_of_week(<expr>)
Analyze Examples
func.to_start_of_week('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_week('2023-11-12 09:38:18.165575') │
│ Date │
├────────────────────────────────────────────────────────────────┤
│ 2023-11-12 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_WEEK(<expr> [, mode])
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
[mode] | Optional. If it is 0, the result is Sunday, otherwise, the result is Monday. The default value is 0 |
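The effect of mode can be sketched in plain Python as follows. This is an illustration of the rule described in the table above, not the Lakehouse implementation, and the helper name `start_of_week` is hypothetical:

```python
from datetime import date, datetime, timedelta

def start_of_week(ts: str, mode: int = 0) -> date:
    """Return the Sunday (mode 0) or Monday (any other mode) starting the week of ts."""
    d = datetime.fromisoformat(ts).date()
    if mode == 0:
        offset = (d.weekday() + 1) % 7  # days elapsed since the most recent Sunday
    else:
        offset = d.weekday()  # days elapsed since the most recent Monday
    return d - timedelta(days=offset)

print(start_of_week("2023-11-12 09:38:18.165575"))     # 2023-11-12 (a Sunday)
print(start_of_week("2023-11-12 09:38:18.165575", 1))  # 2023-11-06
```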
Return Type
DATE, returns date in “YYYY-MM-DD” format.
SQL Examples
SELECT
to_start_of_week('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────┐
│ to_start_of_week('2023-11-12 09:38:18.165575') │
│ Date │
├────────────────────────────────────────────────┤
│ 2023-11-12 │
└────────────────────────────────────────────────┘
8.42 - TO_START_OF_YEAR
Returns the first day of the year for a date or a date with time (timestamp/datetime).
Analyze Syntax
func.to_start_of_year(<expr>)
Analyze Examples
func.to_start_of_year('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_start_of_year('2023-11-12 09:38:18.165575') │
│ Date │
├────────────────────────────────────────────────────────────────┤
│ 2023-01-01 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_START_OF_YEAR(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Return Type
DATE, returns date in “YYYY-MM-DD” format.
SQL Examples
SELECT
to_start_of_year('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────┐
│ to_start_of_year('2023-11-12 09:38:18.165575') │
│ Date │
├────────────────────────────────────────────────┤
│ 2023-01-01 │
└────────────────────────────────────────────────┘
8.43 - TO_TIMESTAMP
TO_TIMESTAMP converts an expression to a date with time (timestamp/datetime).
The function can accept one or two arguments. If given a single string argument, the function parses a timestamp from the string. If the single argument is an integer, the function interprets it as the number of seconds, milliseconds, or microseconds before (for a negative number) or after (for a positive number) the Unix epoch (midnight on January 1, 1970):
- If the integer is less than 31,536,000,000, it is treated as seconds.
- If the integer is greater than or equal to 31,536,000,000 and less than 31,536,000,000,000, it is treated as milliseconds.
- If the integer is greater than or equal to 31,536,000,000,000, it is treated as microseconds.
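The three thresholds above can be sketched in plain Python as follows. This only mirrors the rules as literally stated; the helper name `int_to_timestamp` and the constant names are hypothetical, and how the Lakehouse treats very large negative integers is not covered here:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
SECONDS_LIMIT = 31_536_000_000      # below this: seconds
MILLIS_LIMIT = 31_536_000_000_000   # below this: milliseconds, otherwise microseconds

def int_to_timestamp(value: int) -> datetime:
    """Interpret an integer as an offset from the Unix epoch using the stated thresholds."""
    if value < SECONDS_LIMIT:
        delta = timedelta(seconds=value)
    elif value < MILLIS_LIMIT:
        delta = timedelta(milliseconds=value)
    else:
        delta = timedelta(microseconds=value)
    return EPOCH + delta

print(int_to_timestamp(1))              # 1970-01-01 00:00:01+00:00
print(int_to_timestamp(1699781898000))  # 2023-11-12 09:38:18+00:00
```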
If given two arguments, the function converts the first string to a timestamp based on the format specified in the second string. To customize the format of date and time in PlaidCloud Lakehouse, you can utilize specifiers. These specifiers allow you to define the desired format for date and time values. For a comprehensive list of supported specifiers, see Formatting Date and Time.
- The output timestamp reflects your PlaidCloud Lakehouse timezone.
- The timezone information must be included in the string you want to convert, otherwise NULL will be returned.
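As a rough illustration of format specifiers and timezone handling, the sketch below uses Python's `datetime.strptime`, which accepts the `%Y`, `%m`, `%d`, `%H`, `%M`, `%S`, and `%z` specifiers that also appear in the examples on this page. It is not PlaidCloud Lakehouse code, the full Lakehouse specifier list may differ, and the helper name `parse_with_format` and the fixed UTC offsets standing in for a session timezone are assumptions:

```python
from datetime import datetime, timedelta, timezone

def parse_with_format(text: str, fmt: str, session_offset_hours: int) -> datetime:
    """Parse text with an explicit format, then express it at a fixed UTC offset."""
    parsed = datetime.strptime(text, fmt)  # %z makes the result timezone-aware
    session_tz = timezone(timedelta(hours=session_offset_hours))
    return parsed.astimezone(session_tz).replace(tzinfo=None)

fmt = "%Y-%m-%d %H:%M:%S %z"
print(parse_with_format("2022-02-04 08:58:59 +0900", fmt, 9))   # 2022-02-04 08:58:59
print(parse_with_format("2022-02-04 08:58:59 +0900", fmt, -5))  # 2022-02-03 18:58:59
```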
See also: TO_DATE
Analyze Syntax
func.to_timestamp(<expr>)
Analyze Examples
func.to_timestamp('2022-01-02T03:25:02.868894-07:00')
┌────────────────────────────────────────────────────────────────┐
│ func.to_timestamp('2022-01-02T03:25:02.868894-07:00') │
│ Timestamp │
├────────────────────────────────────────────────────────────────┤
│ 2022-01-02 10:25:02.868894 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
-- Convert a string or integer to a timestamp
TO_TIMESTAMP(<expr>)
-- Convert a string to a timestamp using the given pattern
TO_TIMESTAMP(<expr>, <format>)
Return Type
Returns a timestamp in the format "YYYY-MM-DD hh:mm:ss.ffffff". If the given string matches this format but does not have the time part, it is automatically extended to this pattern. The padding value is 0.
Aliases
SQL Examples
Given a String Argument
SELECT TO_TIMESTAMP('2022-01-02T03:25:02.868894-07:00');
---
2022-01-02 10:25:02.868894
SELECT TO_TIMESTAMP('2022-01-02 02:00:11');
---
2022-01-02 02:00:11.000000
SELECT TO_TIMESTAMP('2022-01-02T02:00:22');
---
2022-01-02 02:00:22.000000
SELECT TO_TIMESTAMP('2022-01-02T01:12:00-07:00');
---
2022-01-02 08:12:00.000000
SELECT TO_TIMESTAMP('2022-01-02T01');
---
2022-01-02 01:00:00.000000
Given an Integer Argument
SELECT TO_TIMESTAMP(1);
---
1970-01-01 00:00:01.000000
SELECT TO_TIMESTAMP(-1);
---
1969-12-31 23:59:59.000000
Note: A Timestamp value ranges from 1000-01-01 00:00:00.000000 to 9999-12-31 23:59:59.999999. PlaidCloud Lakehouse returns an error if you run the following statement:
SELECT TO_TIMESTAMP(9999999999999999999);
Given Two Arguments
SET GLOBAL timezone ='Japan';
SELECT TO_TIMESTAMP('2022年2月4日、8時58分59秒、タイムゾーン:+0900', '%Y年%m月%d日、%H時%M分%S秒、タイムゾーン:%z');
---
2022-02-04 08:58:59.000000
SET GLOBAL timezone ='America/Toronto';
SELECT TO_TIMESTAMP('2022年2月4日、8時58分59秒、タイムゾーン:+0900', '%Y年%m月%d日、%H時%M分%S秒、タイムゾーン:%z');
---
2022-02-03 18:58:59.000000
8.44 - TO_UNIX_TIMESTAMP
Converts a timestamp in a date/time format to a Unix timestamp format. A Unix timestamp represents the number of seconds that have elapsed since January 1, 1970, at 00:00:00 UTC.
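A minimal plain-Python sketch of the conversion, assuming the input string is a UTC wall-clock time and that fractional seconds are discarded. The helper below is a local Python function for illustration, not the Lakehouse function:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_unix_timestamp(ts: str) -> int:
    """Return whole seconds elapsed since 1970-01-01 00:00:00 UTC."""
    moment = datetime.fromisoformat(ts).replace(tzinfo=timezone.utc)
    return (moment - EPOCH) // timedelta(seconds=1)

print(to_unix_timestamp("2023-11-12 09:38:18.165575"))  # 1699781898
```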
Analyze Syntax
func.to_unix_timestamp(<expr>)
Analyze Examples
func.to_unix_timestamp('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────────────────────┐
│ func.to_unix_timestamp('2023-11-12 09:38:18.165575') │
│ UInt32 │
├────────────────────────────────────────────────────────────────┤
│ 1699781898 │
└────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_UNIX_TIMESTAMP(<expr>)
Arguments
Arguments | Description |
---|
<expr> | Timestamp |
For more information about the timestamp data type, see Date & Time.
Return Type
BIGINT
SQL Examples
SELECT
to_unix_timestamp('2023-11-12 09:38:18.165575')
┌─────────────────────────────────────────────────┐
│ to_unix_timestamp('2023-11-12 09:38:18.165575') │
│ UInt32 │
├─────────────────────────────────────────────────┤
│ 1699781898 │
└─────────────────────────────────────────────────┘
8.45 - TO_WEEK_OF_YEAR
Calculates the week number within a year for a given date.
ISO week numbering works as follows: January 4th is always considered part of the first week. If January 1st is a Thursday, then the week that spans from Monday, December 29th, to Sunday, January 4th, is designated as ISO week 1. If January 1st falls on a Friday, then the week that goes from Monday, January 4th, to Sunday, January 10th, is marked as ISO week 1.
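Python's standard library follows the same ISO week rules, so the cases described above can be checked with a short sketch. This is plain Python for illustration, not Lakehouse code, and the helper name `week_of_year` is hypothetical:

```python
from datetime import datetime

def week_of_year(ts: str) -> int:
    """Return the ISO 8601 week number (1-53) for a timestamp string."""
    return datetime.fromisoformat(ts).isocalendar()[1]

print(week_of_year("2024-03-14 23:30:04.011624"))  # 11
# 1 January 2015 was a Thursday, so Monday 29 December 2014 is already in ISO week 1.
print(week_of_year("2014-12-29 00:00:00"))  # 1
# 1 January 2016 was a Friday, so ISO week 1 starts on Monday 4 January 2016.
print(week_of_year("2016-01-04 00:00:00"))  # 1
```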
Analyze Syntax
func.to_week_of_year(<expr>)
Analyze Examples
func.now(), func.to_week_of_year(func.now()), func.week(func.now()), func.weekofyear(func.now())
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.now() │ func.to_week_of_year(func.now()) │ func.week(func.now()) │ func.weekofyear(func.now()) │
├────────────────────────────┼──────────────────────────────────┼───────────────────────┼─────────────────────────────┤
│ 2024-03-14 23:30:04.011624 │ 11 │ 11 │ 11 │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_WEEK_OF_YEAR(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Aliases
- WEEK
- WEEKOFYEAR
Return Type
Returns an integer that represents the week number within a year, with numbering ranging from 1 to 53.
SQL Examples
SELECT NOW(), TO_WEEK_OF_YEAR(NOW()), WEEK(NOW()), WEEKOFYEAR(NOW());
┌───────────────────────────────────────────────────────────────────────────────────────┐
│ now() │ to_week_of_year(now()) │ week(now()) │ weekofyear(now()) │
├────────────────────────────┼────────────────────────┼─────────────┼───────────────────┤
│ 2024-03-14 23:30:04.011624 │ 11 │ 11 │ 11 │
└───────────────────────────────────────────────────────────────────────────────────────┘
8.46 - TO_YEAR
Converts a date or date with time (timestamp/datetime) to a UInt16 number containing the year number (AD).
Analyze Syntax
func.to_year(<expr>)
Analyze Examples
func.now(), func.to_year(func.now()), func.year(func.now())
┌───────────────────────────────────────────────────────────────────────────────┐
│ func.now() │ func.to_year(func.now()) │ func.year(func.now()) │
├────────────────────────────┼──────────────────────────┼───────────────────────┤
│ 2024-03-14 23:37:03.895166 │ 2024 │ 2024 │
└───────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
TO_YEAR(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Aliases
- YEAR
Return Type
SMALLINT
SQL Examples
SELECT NOW(), TO_YEAR(NOW()), YEAR(NOW());
┌───────────────────────────────────────────────────────────┐
│ now() │ to_year(now()) │ year(now()) │
├────────────────────────────┼────────────────┼─────────────┤
│ 2024-03-14 23:37:03.895166 │ 2024 │ 2024 │
└───────────────────────────────────────────────────────────┘
8.47 - TO_YYYYMM
Converts a date or date with time (timestamp/datetime) to a UInt32 number containing the year and month number.
Analyze Syntax
func.to_yyyymm(<expr>)
Analyze Examples
func.to_yyyymm('2023-11-12 09:38:18.165575')
┌──────────────────────────────────────────────┐
│ func.to_yyyymm('2023-11-12 09:38:18.165575') │
│ UInt32 │
├──────────────────────────────────────────────┤
│ 202311 │
└──────────────────────────────────────────────┘
SQL Syntax
TO_YYYYMM(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Return Type
INT, returns in YYYYMM format.
SQL Examples
SELECT
to_yyyymm('2023-11-12 09:38:18.165575')
┌─────────────────────────────────────────┐
│ to_yyyymm('2023-11-12 09:38:18.165575') │
│ UInt32 │
├─────────────────────────────────────────┤
│ 202311 │
└─────────────────────────────────────────┘
8.48 - TO_YYYYMMDD
Converts a date or date with time (timestamp/datetime) to a UInt32 number containing the year, month, and day number (YYYY * 10000 + MM * 100 + DD).
Analyze Syntax
func.to_yyyymmdd(<expr>)
Analyze Examples
func.to_yyyymmdd('2023-11-12 09:38:18.165575')
┌────────────────────────────────────────────────┐
│ func.to_yyyymmdd('2023-11-12 09:38:18.165575') │
│ UInt32 │
├────────────────────────────────────────────────┤
│ 20231112 │
└────────────────────────────────────────────────┘
SQL Syntax
TO_YYYYMMDD(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/datetime |
Return Type
INT, returns in YYYYMMDD format.
SQL Examples
SELECT
to_yyyymmdd('2023-11-12 09:38:18.165575')
┌───────────────────────────────────────────┐
│ to_yyyymmdd('2023-11-12 09:38:18.165575') │
│ UInt32 │
├───────────────────────────────────────────┤
│ 20231112 │
└───────────────────────────────────────────┘
8.49 - TO_YYYYMMDDHH
Converts a given date or timestamp to an unsigned integer in the format "YYYYMMDDHH" (Year, Month, Day, Hour).
Analyze Syntax
func.to_yyyymmddhh(<expr>)
Analyze Examples
func.to_yyyymmddhh('2023-11-12 09:38:18.165575')
┌──────────────────────────────────────────────────┐
│ func.to_yyyymmddhh('2023-11-12 09:38:18.165575') │
│ UInt32 │
├──────────────────────────────────────────────────┤
│ 2023111209 │
└──────────────────────────────────────────────────┘
SQL Syntax
TO_YYYYMMDDHH(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/datetime |
Return Type
Returns an unsigned 64-bit integer (UInt64) in the format "YYYYMMDDHH".
SQL Examples
SELECT
to_yyyymmddhh('2023-11-12 09:38:18.165575')
┌─────────────────────────────────────────────┐
│ to_yyyymmddhh('2023-11-12 09:38:18.165575') │
│ UInt32 │
├─────────────────────────────────────────────┤
│ 2023111209 │
└─────────────────────────────────────────────┘
8.50 - TO_YYYYMMDDHHMMSS
Converts a date or date with time (timestamp/datetime) to a UInt64 number containing the year, month, day, hour, minute, and second (YYYY * 10000000000 + MM * 100000000 + DD * 1000000 + hh * 10000 + mm * 100 + ss).
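The packing formula can be worked through in plain Python. This is a sketch of the arithmetic only; the helper below is a local Python function, not the Lakehouse function:

```python
from datetime import datetime

def to_yyyymmddhhmmss(ts: str) -> int:
    """Pack the date and time fields of ts into one integer, YYYYMMDDhhmmss."""
    t = datetime.fromisoformat(ts)
    return (
        t.year * 10_000_000_000
        + t.month * 100_000_000
        + t.day * 1_000_000
        + t.hour * 10_000
        + t.minute * 100
        + t.second
    )

print(to_yyyymmddhhmmss("2023-11-12 09:38:18.165575"))  # 20231112093818
```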
Analyze Syntax
func.to_yyyymmddhhmmss(<expr>)
Analyze Examples
func.to_yyyymmddhhmmss('2023-11-12 09:38:18.165575')
┌──────────────────────────────────────────────────────┐
│ func.to_yyyymmddhhmmss('2023-11-12 09:38:18.165575') │
│ UInt64 │
├──────────────────────────────────────────────────────┤
│ 20231112093818 │
└──────────────────────────────────────────────────────┘
SQL Syntax
TO_YYYYMMDDHHMMSS(<expr>)
Arguments
Arguments | Description |
---|
<expr> | date/timestamp |
Return Type
BIGINT, returns in YYYYMMDDhhmmss format.
SQL Examples
SELECT
to_yyyymmddhhmmss('2023-11-12 09:38:18.165575')
┌─────────────────────────────────────────────────┐
│ to_yyyymmddhhmmss('2023-11-12 09:38:18.165575') │
│ UInt64 │
├─────────────────────────────────────────────────┤
│ 20231112093818 │
└─────────────────────────────────────────────────┘
8.51 - TODAY
Returns the current date.
Analyze Syntax
func.today()
Analyze Examples
func.today()
+--------------+
| func.today() |
+--------------+
| 2021-09-03 |
+--------------+
SQL Syntax
TODAY()
Return Type
DATE, returns date in “YYYY-MM-DD” format.
SQL Examples
SELECT TODAY();
+------------+
| TODAY() |
+------------+
| 2021-09-03 |
+------------+
8.52 - TOMORROW
Returns tomorrow's date, same as today() + 1.
Analyze Syntax
func.tomorrow()
Analyze Examples
func.tomorrow()
+-----------------+
| func.tomorrow() |
+-----------------+
| 2021-09-04 |
+-----------------+
SQL Syntax
TOMORROW()
Return Type
DATE, returns date in “YYYY-MM-DD” format.
SQL Examples
SELECT TOMORROW();
+------------+
| TOMORROW() |
+------------+
| 2021-09-04 |
+------------+
SELECT TODAY()+1;
+---------------+
| (TODAY() + 1) |
+---------------+
| 2021-09-04 |
+---------------+
8.53 - TRY_TO_DATETIME
Alias for TRY_TO_TIMESTAMP.
8.54 - TRY_TO_TIMESTAMP
A variant of TO_TIMESTAMP in PlaidCloud Lakehouse that performs the same conversion of an input expression to a timestamp, but returns NULL instead of raising an error if the conversion fails.
See also: TO_TIMESTAMP
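The return-NULL-on-failure behavior is analogous to catching a parse error and returning an empty value. A minimal plain-Python sketch of that pattern, where `None` stands in for NULL; the helper is a local Python function handling ISO-style strings only and is not the Lakehouse function:

```python
from datetime import datetime
from typing import Optional

def try_to_timestamp(text: str) -> Optional[datetime]:
    """Parse an ISO-style timestamp string, returning None instead of raising."""
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        return None

print(try_to_timestamp("2022-01-02 02:00:11"))  # 2022-01-02 02:00:11
print(try_to_timestamp("plaidcloud"))           # None
```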
Analyze Syntax
func.try_to_timestamp(<expr>)
Analyze Examples
func.try_to_timestamp('2022-01-02 02:00:11'), func.try_to_datetime('2022-01-02 02:00:11'), func.try_to_timestamp('plaidcloud')
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.try_to_timestamp('2022-01-02 02:00:11') │ func.try_to_datetime('2022-01-02 02:00:11') │ func.try_to_timestamp('plaidcloud') │
│ Timestamp │ Timestamp │ │
├──────────────────────────────────────────────┼─────────────────────────────────────────────┼─────────────────────────────────────┤
│ 2022-01-02 02:00:11 │ 2022-01-02 02:00:11 │ NULL │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
-- Convert a string or integer to a timestamp
TRY_TO_TIMESTAMP(<expr>)
-- Convert a string to a timestamp using the given pattern
TRY_TO_TIMESTAMP(<expr, expr>)
Aliases
- TRY_TO_DATETIME
SQL Examples
SELECT TRY_TO_TIMESTAMP('2022-01-02 02:00:11'), TRY_TO_DATETIME('2022-01-02 02:00:11');
┌──────────────────────────────────────────────────────────────────────────────────┐
│ try_to_timestamp('2022-01-02 02:00:11') │ try_to_datetime('2022-01-02 02:00:11') │
│ Timestamp │ Timestamp │
├─────────────────────────────────────────┼────────────────────────────────────────┤
│ 2022-01-02 02:00:11 │ 2022-01-02 02:00:11 │
└──────────────────────────────────────────────────────────────────────────────────┘
SELECT TRY_TO_TIMESTAMP('databend'), TRY_TO_DATETIME('databend');
┌────────────────────────────────────────────────────────────┐
│ try_to_timestamp('databend') │ try_to_datetime('databend') │
├──────────────────────────────┼─────────────────────────────┤
│ NULL │ NULL │
└────────────────────────────────────────────────────────────┘
8.55 - WEEK
Alias for TO_WEEK_OF_YEAR.
8.56 - WEEKOFYEAR
Alias for TO_WEEK_OF_YEAR.
8.57 - YEAR
Alias for TO_YEAR.
8.58 - YESTERDAY
Returns yesterday's date, same as today() - 1.
Analyze Syntax
func.yesterday()
Analyze Examples
func.yesterday()
+------------------+
| func.yesterday() |
+------------------+
| 2021-09-02 |
+------------------+
SQL Syntax
YESTERDAY()
Return Type
DATE, returns date in “YYYY-MM-DD” format.
SQL Examples
SELECT YESTERDAY();
+-------------+
| YESTERDAY() |
+-------------+
| 2021-09-02 |
+-------------+
SELECT TODAY()-1;
+---------------+
| (TODAY() - 1) |
+---------------+
| 2021-09-02 |
+---------------+
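The equivalence between YESTERDAY() and TODAY() - 1 is plain date arithmetic. As a rough Python sketch (the `yesterday` helper and its optional `today` parameter are hypothetical, added only so the example is reproducible):

```python
from datetime import date, timedelta
from typing import Optional


def yesterday(today: Optional[date] = None) -> date:
    """Return the day before `today` (defaults to the current local date)."""
    base = today if today is not None else date.today()
    return base - timedelta(days=1)


# With 2021-09-03 as "today", this matches the date shown in the tables above.
print(yesterday(date(2021, 9, 3)).isoformat())  # 2021-09-02
```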
9 - Geography Functions
This section provides reference information for the geography functions in PlaidCloud Lakehouse. These functions are based on H3, the hexagonal geospatial indexing system developed by Uber to better calculate geographic relationships. An explanation of H3 can be found in the H3 documentation.
Coordinate Conversion
Hexagon Properties
Hexagon Relationships
Measurement
General Utility
9.1 - GEO_TO_H3
Returns the H3 index of the hexagon cell where the given location resides. Returning 0 means an error occurred.
Analyze Syntax
func.geo_to_h3(lon, lat, res)
Analyze Examples
func.geo_to_h3(37.79506683, 55.71290588, 15)
┌──────────────────────────────────────────────┐
│ func.geo_to_h3(37.79506683, 55.71290588, 15) │
├──────────────────────────────────────────────┤
│ 644325524701193974 │
└──────────────────────────────────────────────┘
SQL Syntax
GEO_TO_H3(lon, lat, res)
SQL Examples
SELECT GEO_TO_H3(37.79506683, 55.71290588, 15);
┌─────────────────────────────────────────┐
│ geo_to_h3(37.79506683, 55.71290588, 15) │
├─────────────────────────────────────────┤
│ 644325524701193974 │
└─────────────────────────────────────────┘
9.2 - GEOHASH_DECODE
Converts a Geohash-encoded string into a (longitude, latitude) coordinate pair.
Analyze Syntax
func.geohash_decode('<geohashed-string>')
Analyze Examples
func.geohash_decode('ezs42')
┌─────────────────────────────────┐
│ func.geohash_decode('ezs42') │
├─────────────────────────────────┤
│ (-5.60302734375,42.60498046875) │
└─────────────────────────────────┘
SQL Syntax
GEOHASH_DECODE('<geohashed-string>')
SQL Examples
SELECT GEOHASH_DECODE('ezs42');
┌─────────────────────────────────┐
│ geohash_decode('ezs42') │
├─────────────────────────────────┤
│ (-5.60302734375,42.60498046875) │
└─────────────────────────────────┘
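To make the decoding concrete, here is a minimal Python sketch of the standard Geohash algorithm (base-32 characters, bits alternating between longitude and latitude, each bit halving the current interval). It is an illustration under those assumptions rather than the engine's code; the function name is hypothetical, and the result is the center of the decoded cell in (longitude, latitude) order.

```python
from typing import Tuple

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_decode(geohash: str) -> Tuple[float, float]:
    """Return the (longitude, latitude) center of the cell named by `geohash`."""
    lon_lo, lon_hi = -180.0, 180.0
    lat_lo, lat_hi = -90.0, 90.0
    is_lon = True  # bits alternate, starting with longitude
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):  # 5 bits per character, high bit first
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
            else:
                mid = (lat_lo + lat_hi) / 2
                lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
            is_lon = not is_lon
    return ((lon_lo + lon_hi) / 2, (lat_lo + lat_hi) / 2)


print(geohash_decode("ezs42"))  # (-5.60302734375, 42.60498046875)
```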
9.3 - GEOHASH_ENCODE
Converts a pair of longitude and latitude coordinates into a Geohash-encoded string.
Analyze Syntax
func.geohash_encode(lon, lat)
Analyze Examples
func.geohash_encode(-5.60302734375, 42.593994140625)
┌─────────────────────────────────────────────────────────┐
│ func.geohash_encode((- 5.60302734375), 42.593994140625) │
├─────────────────────────────────────────────────────────┤
│ ezs42d000000 │
└─────────────────────────────────────────────────────────┘
SQL Syntax
GEOHASH_ENCODE(lon, lat)
SQL Examples
SELECT GEOHASH_ENCODE(-5.60302734375, 42.593994140625);
┌────────────────────────────────────────────────────┐
│ geohash_encode((- 5.60302734375), 42.593994140625) │
├────────────────────────────────────────────────────┤
│ ezs42d000000 │
└────────────────────────────────────────────────────┘
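The encoding can be sketched in Python with the standard Geohash procedure: repeatedly halve the longitude and latitude intervals, emit one bit per halving (longitude first), and map every 5 bits to a base-32 character. This is an illustrative sketch, not the engine's implementation; the function name and the default precision of 12 characters (chosen to match the length of the output above) are assumptions.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lon: float, lat: float, precision: int = 12) -> str:
    """Encode (lon, lat) as a Geohash string of `precision` characters."""
    lon_lo, lon_hi = -180.0, 180.0
    lat_lo, lat_hi = -90.0, 90.0
    chars = []
    value, bits, is_lon = 0, 0, True
    while len(chars) < precision:
        if is_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        bits += 1
        is_lon = not is_lon
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            value, bits = 0, 0
    return "".join(chars)


print(geohash_encode(-5.60302734375, 42.593994140625))  # ezs42d000000
```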
9.4 - H3_CELL_AREA_M2
Returns the exact area of the specified cell in square meters.
Analyze Syntax
func.h3_cell_area_m2(h3)
Analyze Examples
func.h3_cell_area_m2(599119489002373119)
┌──────────────────────────────────────────┐
│ func.h3_cell_area_m2(599119489002373119) │
├──────────────────────────────────────────┤
│ 127785582.60809991 │
└──────────────────────────────────────────┘
SQL Syntax
H3_CELL_AREA_M2(h3)
SQL Examples
SELECT H3_CELL_AREA_M2(599119489002373119);
┌─────────────────────────────────────┐
│ h3_cell_area_m2(599119489002373119) │
├─────────────────────────────────────┤
│ 127785582.60809991 │
└─────────────────────────────────────┘
9.5 - H3_CELL_AREA_RADS2
Returns the exact area of the specified cell in square radians.
Analyze Syntax
func.h3_cell_area_rads2(h3)
Analyze Examples
func.h3_cell_area_rads2(599119489002373119)
┌─────────────────────────────────────────────┐
│ func.h3_cell_area_rads2(599119489002373119) │
├─────────────────────────────────────────────┤
│ 0.000003148224310427697 │
└─────────────────────────────────────────────┘
SQL Syntax
H3_CELL_AREA_RADS2(h3)
SQL Examples
SELECT H3_CELL_AREA_RADS2(599119489002373119);
┌────────────────────────────────────────┐
│ h3_cell_area_rads2(599119489002373119) │
├────────────────────────────────────────┤
│ 0.000003148224310427697 │
└────────────────────────────────────────┘
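An area in square radians relates to an area in square meters through the square of the Earth radius. The Python sketch below assumes the mean Earth radius constant used by the H3 library (6371.007180918475 km); the helper name is hypothetical.

```python
import math

# Assumption: mean Earth radius used by the H3 library, expressed in meters.
EARTH_RADIUS_M = 6371007.180918475


def rads2_to_m2(area_rads2: float) -> float:
    """Convert an area on the unit sphere (square radians) to square meters."""
    return area_rads2 * EARTH_RADIUS_M ** 2


area_m2 = rads2_to_m2(0.000003148224310427697)
# Close to the H3_CELL_AREA_M2 result for the same cell (about 1.278e8 m2).
print(math.isclose(area_m2, 127785582.60809991, rel_tol=1e-6))  # True
```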
9.6 - H3_DISTANCE
Returns the grid distance between the two given H3 indexes.
Note: H3 distance calculations can only calculate distances between hexes that are neighbors. Trying to use this with non-neighbor hexes will error.
Analyze Syntax
func.h3_distance(h3, a_h3)
Analyze Examples
func.h3_distance(599119489002373119, 599119491149856767)
┌──────────────────────────────────────────────────────────┐
│ func.h3_distance(599119489002373119, 599119491149856767) │
├──────────────────────────────────────────────────────────┤
│ 1 │
└──────────────────────────────────────────────────────────┘
SQL Syntax
H3_DISTANCE(h3, a_h3)
SQL Examples
SELECT H3_DISTANCE(599119489002373119, 599119491149856767);
┌─────────────────────────────────────────────────────┐
│ h3_distance(599119489002373119, 599119491149856767) │
├─────────────────────────────────────────────────────┤
│ 1 │
└─────────────────────────────────────────────────────┘
9.7 - H3_EDGE_ANGLE
Returns the average length of the H3 hexagon edge in degrees at the given resolution.
Analyze Syntax
func.h3_edge_angle(res)
Analyze Examples
func.h3_edge_angle(10)
┌────────────────────────────┐
│ func.h3_edge_angle(10) │
├────────────────────────────┤
│ 0.0006822586214153981 │
└────────────────────────────┘
SQL Syntax
H3_EDGE_ANGLE(res)
SQL Examples
SELECT H3_EDGE_ANGLE(10);
┌───────────────────────┐
│ h3_edge_angle(10) │
├───────────────────────┤
│ 0.0006822586214153981 │
└───────────────────────┘
9.8 - H3_EDGE_LENGTH_KM
Returns the average hexagon edge length in kilometers at the given resolution. Excludes pentagons.
Analyze Syntax
func.h3_edge_length_km(res)
Analyze Examples
func.h3_edge_length_km(1)
┌───────────────────────────┐
│ func.h3_edge_length_km(1) │
├───────────────────────────┤
│ 483.0568390711111 │
└───────────────────────────┘
SQL Syntax
H3_EDGE_LENGTH_KM(res)
SQL Examples
SELECT H3_EDGE_LENGTH_KM(1);
┌──────────────────────┐
│ h3_edge_length_km(1) │
├──────────────────────┤
│ 483.0568390711111 │
└──────────────────────┘
9.9 - H3_EDGE_LENGTH_M
Returns the average hexagon edge length in meters at the given resolution. Excludes pentagons.
Analyze Syntax
func.h3_edge_length_m(res)
Analyze Examples
func.h3_edge_length_m(1)
┌──────────────────────────┐
│ func.h3_edge_length_m(1) │
├──────────────────────────┤
│ 483056.8390711111 │
└──────────────────────────┘
SQL Syntax
H3_EDGE_LENGTH_M(res)
SQL Examples
SELECT H3_EDGE_LENGTH_M(1);
┌─────────────────────┐
│ h3_edge_length_m(1) │
├─────────────────────┤
│ 483056.8390711111 │
└─────────────────────┘
9.10 - H3_EXACT_EDGE_LENGTH_KM
Computes the length of this directed edge, in kilometers.
Analyze Syntax
func.h3_exact_edge_length_km(h3)
Analyze Examples
func.h3_exact_edge_length_km(1319695429381652479)
┌───────────────────────────────────────────────────┐
│ func.h3_exact_edge_length_km(1319695429381652479) │
├───────────────────────────────────────────────────┤
│ 8.267326832647143 │
└───────────────────────────────────────────────────┘
SQL Syntax
H3_EXACT_EDGE_LENGTH_KM(h3)
SQL Examples
SELECT H3_EXACT_EDGE_LENGTH_KM(1319695429381652479);
┌──────────────────────────────────────────────┐
│ h3_exact_edge_length_km(1319695429381652479) │
├──────────────────────────────────────────────┤
│ 8.267326832647143 │
└──────────────────────────────────────────────┘
9.11 - H3_EXACT_EDGE_LENGTH_M
Computes the length of this directed edge, in meters.
Analyze Syntax
func.h3_exact_edge_length_m(h3)
Analyze Examples
func.h3_exact_edge_length_m(1319695429381652479)
┌──────────────────────────────────────────────────┐
│ func.h3_exact_edge_length_m(1319695429381652479) │
├──────────────────────────────────────────────────┤
│ 8267.326832647143 │
└──────────────────────────────────────────────────┘
SQL Syntax
H3_EXACT_EDGE_LENGTH_M(h3)
SQL Examples
SELECT H3_EXACT_EDGE_LENGTH_M(1319695429381652479);
┌─────────────────────────────────────────────┐
│ h3_exact_edge_length_m(1319695429381652479) │
├─────────────────────────────────────────────┤
│ 8267.326832647143 │
└─────────────────────────────────────────────┘
9.12 - H3_EXACT_EDGE_LENGTH_RADS
Computes the length of this directed edge, in radians.
Analyze Syntax
func.h3_exact_edge_length_rads(h3)
Analyze Examples
func.h3_exact_edge_length_km(1319695429381652479)
┌───────────────────────────────────────────────────┐
│ func.h3_exact_edge_length_km(1319695429381652479) │
├───────────────────────────────────────────────────┤
│ 8.267326832647143 │
└───────────────────────────────────────────────────┘
SQL Syntax
H3_EXACT_EDGE_LENGTH_RADS(h3)
SQL Examples
SELECT H3_EXACT_EDGE_LENGTH_KM(1319695429381652479);
┌──────────────────────────────────────────────┐
│ h3_exact_edge_length_km(1319695429381652479) │
├──────────────────────────────────────────────┤
│ 8.267326832647143 │
└──────────────────────────────────────────────┘
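A length in radians is the arc length on a unit sphere, so it relates to the kilometer length of the same edge by a division by the Earth radius. The Python sketch below assumes the mean Earth radius constant used by the H3 library; the helper name is hypothetical, and the printed value is derived from the kilometer figure shown above, not taken from a Lakehouse query.

```python
# Assumption: mean Earth radius used by the H3 library, in kilometers.
EARTH_RADIUS_KM = 6371.007180918475


def km_to_rads(length_km: float) -> float:
    """Convert a great-circle length in kilometers to radians."""
    return length_km / EARTH_RADIUS_KM


# Kilometer length of edge 1319695429381652479, expressed in radians
# (roughly 0.0012976).
print(km_to_rads(8.267326832647143))
```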
9.13 - H3_GET_BASE_CELL
Returns the base cell number of the given H3 index.
Analyze Syntax
func.h3_get_base_cell(h3)
Analyze Examples
func.h3_get_base_cell(644325524701193974)
┌───────────────────────────────────────────┐
│ func.h3_get_base_cell(644325524701193974) │
├───────────────────────────────────────────┤
│ 8 │
└───────────────────────────────────────────┘
SQL Syntax
H3_GET_BASE_CELL(h3)
SQL Examples
SELECT H3_GET_BASE_CELL(644325524701193974);
┌──────────────────────────────────────┐
│ h3_get_base_cell(644325524701193974) │
├──────────────────────────────────────┤
│ 8 │
└──────────────────────────────────────┘
9.14 - H3_GET_DESTINATION_INDEX_FROM_UNIDIRECTIONAL_EDGE
Returns the destination hexagon index from the unidirectional edge H3Index.
Analyze Syntax
func.h3_get_destination_index_from_unidirectional_edge(h3)
Analyze Examples
func.h3_get_destination_index_from_unidirectional_edge(1248204388774707199)
┌─────────────────────────────────────────────────────────────────────────────┐
│ func.h3_get_destination_index_from_unidirectional_edge(1248204388774707199) │
├─────────────────────────────────────────────────────────────────────────────┤
│ 599686043507097599 │
└─────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_GET_DESTINATION_INDEX_FROM_UNIDIRECTIONAL_EDGE(h3)
SQL Examples
SELECT H3_GET_DESTINATION_INDEX_FROM_UNIDIRECTIONAL_EDGE(1248204388774707199);
┌────────────────────────────────────────────────────────────────────────┐
│ h3_get_destination_index_from_unidirectional_edge(1248204388774707199) │
├────────────────────────────────────────────────────────────────────────┤
│ 599686043507097599 │
└────────────────────────────────────────────────────────────────────────┘
9.15 - H3_GET_FACES
Finds all icosahedron faces intersected by the given H3 index. Faces are represented as integers from 0 to 19.
Analyze Syntax
func.h3_get_faces(h3)
Analyze Examples
func.h3_get_faces(599119489002373119)
┌───────────────────────────────────────┐
│ func.h3_get_faces(599119489002373119) │
├───────────────────────────────────────┤
│ [0,1,2,3,4] │
└───────────────────────────────────────┘
SQL Syntax
H3_GET_FACES(h3)
SQL Examples
SELECT H3_GET_FACES(599119489002373119);
┌──────────────────────────────────┐
│ h3_get_faces(599119489002373119) │
├──────────────────────────────────┤
│ [0,1,2,3,4] │
└──────────────────────────────────┘
9.16 - H3_GET_INDEXES_FROM_UNIDIRECTIONAL_EDGE
Returns the origin and destination hexagon indexes from the given unidirectional edge H3Index.
Analyze Syntax
func.h3_get_indexes_from_unidirectional_edge(h3)
Analyze Examples
func.h3_get_indexes_from_unidirectional_edge(1248204388774707199)
┌────────────────────────────────────────────────────────────────────┐
│ func.h3_get_indexes_from_unidirectional_edge(1248204388774707199) │
├────────────────────────────────────────────────────────────────────┤
│ (599686042433355775,599686043507097599) │
└────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_GET_INDEXES_FROM_UNIDIRECTIONAL_EDGE(h3)
SQL Examples
SELECT H3_GET_INDEXES_FROM_UNIDIRECTIONAL_EDGE(1248204388774707199);
┌──────────────────────────────────────────────────────────────┐
│ h3_get_indexes_from_unidirectional_edge(1248204388774707199) │
├──────────────────────────────────────────────────────────────┤
│ (599686042433355775,599686043507097599) │
└──────────────────────────────────────────────────────────────┘
9.17 - H3_GET_ORIGIN_INDEX_FROM_UNIDIRECTIONAL_EDGE
Returns the origin hexagon index from the unidirectional edge H3Index.
Analyze Syntax
func.h3_get_origin_index_from_unidirectional_edge(h3)
Analyze Examples
func.h3_get_origin_index_from_unidirectional_edge(1248204388774707199)
┌────────────────────────────────────────────────────────────────────────┐
│ func.h3_get_origin_index_from_unidirectional_edge(1248204388774707199) │
├────────────────────────────────────────────────────────────────────────┤
│ 599686042433355775 │
└────────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_GET_ORIGIN_INDEX_FROM_UNIDIRECTIONAL_EDGE(h3)
SQL Examples
SELECT H3_GET_ORIGIN_INDEX_FROM_UNIDIRECTIONAL_EDGE(1248204388774707199);
┌───────────────────────────────────────────────────────────────────┐
│ h3_get_origin_index_from_unidirectional_edge(1248204388774707199) │
├───────────────────────────────────────────────────────────────────┤
│ 599686042433355775 │
└───────────────────────────────────────────────────────────────────┘
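In the H3 64-bit index layout, a unidirectional edge is stored as its origin cell with the mode field (bits 59 to 62) set to 2 and the edge direction kept in the three reserved bits (56 to 58). Under that layout, the origin can be recovered with bit operations alone. The Python sketch below illustrates the idea; the function names are hypothetical and this is not the engine's code.

```python
MODE_SHIFT = 59
MODE_MASK = 0xF << MODE_SHIFT          # 4-bit index mode
RESERVED_SHIFT = 56
RESERVED_MASK = 0x7 << RESERVED_SHIFT  # 3 bits holding the edge direction
CELL_MODE = 1                          # mode value of an ordinary cell index


def origin_from_edge(edge: int) -> int:
    """Clear the mode and direction bits, then mark the index as a cell."""
    cleared = edge & ~(MODE_MASK | RESERVED_MASK)
    return cleared | (CELL_MODE << MODE_SHIFT)


def edge_direction(edge: int) -> int:
    """Direction (1 to 6) from the origin cell toward the destination cell."""
    return (edge & RESERVED_MASK) >> RESERVED_SHIFT


edge = 1248204388774707199
print(origin_from_edge(edge))  # 599686042433355775
print(edge_direction(edge))    # 1
```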
9.18 - H3_GET_RESOLUTION
Returns the resolution of the given H3 index.
Analyze Syntax
func.h3_get_resolution(h3)
Analyze Examples
func.h3_get_resolution(644325524701193974)
┌────────────────────────────────────────────┐
│ func.h3_get_resolution(644325524701193974) │
├────────────────────────────────────────────┤
│ 15 │
└────────────────────────────────────────────┘
SQL Syntax
H3_GET_RESOLUTION(h3)
SQL Examples
SELECT H3_GET_RESOLUTION(644325524701193974);
┌───────────────────────────────────────┐
│ h3_get_resolution(644325524701193974) │
├───────────────────────────────────────┤
│ 15 │
└───────────────────────────────────────┘
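Both the resolution and the base cell number are fixed bit fields of the 64-bit H3 index: bits 52 to 55 hold the resolution and bits 45 to 51 hold the base cell. A minimal Python sketch of reading those fields follows; the helper names are hypothetical, and no validity checking is done.

```python
def resolution_of(h3: int) -> int:
    """Resolution (0 to 15) stored in bits 52-55 of an H3 index."""
    return (h3 >> 52) & 0xF


def base_cell_of(h3: int) -> int:
    """Base cell number (0 to 121) stored in bits 45-51 of an H3 index."""
    return (h3 >> 45) & 0x7F


print(resolution_of(644325524701193974))  # 15
print(base_cell_of(644325524701193974))   # 8
```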
9.19 - H3_GET_UNIDIRECTIONAL_EDGE
Returns the edge between the given two H3 indexes.
Analyze Syntax
func.h3_get_unidirectional_edge(h3, a_h3)
Analyze Examples
func.h3_get_unidirectional_edge(644325524701193897, 644325524701193754)
┌─────────────────────────────────────────────────────────────────────────┐
│ func.h3_get_unidirectional_edge(644325524701193897, 644325524701193754) │
├─────────────────────────────────────────────────────────────────────────┤
│ 1581074247194257065 │
└─────────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_GET_UNIDIRECTIONAL_EDGE(h3, a_h3)
SQL Examples
SELECT H3_GET_UNIDIRECTIONAL_EDGE(644325524701193897, 644325524701193754);
┌────────────────────────────────────────────────────────────────────┐
│ h3_get_unidirectional_edge(644325524701193897, 644325524701193754) │
├────────────────────────────────────────────────────────────────────┤
│ 1581074247194257065 │
└────────────────────────────────────────────────────────────────────┘
9.20 - H3_GET_UNIDIRECTIONAL_EDGE_BOUNDARY
Returns the coordinates defining the unidirectional edge.
Analyze Syntax
func.h3_get_unidirectional_edge_boundary(h3)
Analyze Examples
func.h3_get_unidirectional_edge_boundary(1248204388774707199)
┌─────────────────────────────────────────────────────────────────────────────────┐
│ func.h3_get_unidirectional_edge_boundary(1248204388774707199) │
├─────────────────────────────────────────────────────────────────────────────────┤
│ [(37.42012867767778,-122.03773496427027),(37.33755608435298,-122.090428929044)] │
└─────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_GET_UNIDIRECTIONAL_EDGE_BOUNDARY(h3)
SQL Examples
SELECT H3_GET_UNIDIRECTIONAL_EDGE_BOUNDARY(1248204388774707199);
┌─────────────────────────────────────────────────────────────────────────────────┐
│ h3_get_unidirectional_edge_boundary(1248204388774707199) │
├─────────────────────────────────────────────────────────────────────────────────┤
│ [(37.42012867767778,-122.03773496427027),(37.33755608435298,-122.090428929044)] │
└─────────────────────────────────────────────────────────────────────────────────┘
9.21 - H3_GET_UNIDIRECTIONAL_EDGES_FROM_HEXAGON
Returns all of the unidirectional edges from the provided H3Index.
Analyze Syntax
func.h3_get_unidirectional_edges_from_hexagon(h3)
Analyze Examples
func.h3_get_unidirectional_edges_from_hexagon(644325524701193754)
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.h3_get_unidirectional_edges_from_hexagon(644325524701193754) │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [1292843871042545178,1364901465080473114,1436959059118401050,1509016653156328986,1581074247194256922,1653131841232184858] │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_GET_UNIDIRECTIONAL_EDGES_FROM_HEXAGON(h3)
SQL Examples
SELECT H3_GET_UNIDIRECTIONAL_EDGES_FROM_HEXAGON(644325524701193754);
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ h3_get_unidirectional_edges_from_hexagon(644325524701193754) │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [1292843871042545178,1364901465080473114,1436959059118401050,1509016653156328986,1581074247194256922,1653131841232184858] │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
9.22 - H3_HEX_AREA_KM2
Returns the average hexagon area in square kilometers at the given resolution. Excludes pentagons.
Analyze Syntax
func.h3_hex_area_km2(res)
Analyze Examples
func.h3_hex_area_km2(1)
┌─────────────────────────┐
│ func.h3_hex_area_km2(1) │
├─────────────────────────┤
│ 609788.4417941332 │
└─────────────────────────┘
SQL Syntax
H3_HEX_AREA_KM2(res)
SQL Examples
SELECT H3_HEX_AREA_KM2(1);
┌────────────────────┐
│ h3_hex_area_km2(1) │
├────────────────────┤
│ 609788.4417941332 │
└────────────────────┘
9.23 - H3_HEX_AREA_M2
Returns the average hexagon area in square meters at the given resolution. Excludes pentagons.
Analyze Syntax
func.h3_hex_area_m2(res)
Analyze Examples
func.h3_hex_area_m2(1)
┌────────────────────────┐
│ func.h3_hex_area_m2(1) │
├────────────────────────┤
│ 609788441794.1339 │
└────────────────────────┘
SQL Syntax
H3_HEX_AREA_M2(res)
SQL Examples
SELECT H3_HEX_AREA_M2(1);
┌───────────────────┐
│ h3_hex_area_m2(1) │
├───────────────────┤
│ 609788441794.1339 │
└───────────────────┘
9.24 - H3_HEX_RING
Returns the "hollow" ring of hexagons at exactly grid distance k from the given H3 index.
Analyze Syntax
func.h3_hex_ring(h3, k)
Analyze Examples
func.h3_hex_ring(599686042433355775, 2)
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.h3_hex_ring(599686042433355775, 2) │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [599686018811035647,599686034917163007,599686029548453887,599686032769679359,599686198125920255,599686040285872127,599686041359613951,599686039212130303,599686023106002943,599686027400970239,599686013442326527,599686012368584703] │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_HEX_RING(h3, k)
SQL Examples
SELECT H3_HEX_RING(599686042433355775, 2);
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ h3_hex_ring(599686042433355775, 2) │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [599686018811035647,599686034917163007,599686029548453887,599686032769679359,599686198125920255,599686040285872127,599686041359613951,599686039212130303,599686023106002943,599686027400970239,599686013442326527,599686012368584703] │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
9.25 - H3_INDEXES_ARE_NEIGHBORS
Returns whether or not the provided H3 indexes are neighbors.
Analyze Syntax
func.h3_indexes_are_neighbors(h3, a_h3)
Analyze Examples
func.h3_indexes_are_neighbors(644325524701193974, 644325524701193897)
┌───────────────────────────────────────────────────────────────────────┐
│ func.h3_indexes_are_neighbors(644325524701193974, 644325524701193897) │
├───────────────────────────────────────────────────────────────────────┤
│ true │
└───────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_INDEXES_ARE_NEIGHBORS(h3, a_h3)
SQL Examples
SELECT H3_INDEXES_ARE_NEIGHBORS(644325524701193974, 644325524701193897);
┌──────────────────────────────────────────────────────────────────┐
│ h3_indexes_are_neighbors(644325524701193974, 644325524701193897) │
├──────────────────────────────────────────────────────────────────┤
│ true │
└──────────────────────────────────────────────────────────────────┘
9.26 - H3_IS_PENTAGON
Checks if the given H3 index represents a pentagonal cell.
Analyze Syntax
func.h3_is_pentagon(h3)
Analyze Examples
func.h3_is_pentagon(599119489002373119)
┌─────────────────────────────────────────┐
│ func.h3_is_pentagon(599119489002373119) │
├─────────────────────────────────────────┤
│ true │
└─────────────────────────────────────────┘
SQL Syntax
H3_IS_PENTAGON(h3)
SQL Examples
SELECT H3_IS_PENTAGON(599119489002373119);
┌────────────────────────────────────┐
│ h3_is_pentagon(599119489002373119) │
├────────────────────────────────────┤
│ true │
└────────────────────────────────────┘
9.27 - H3_IS_RES_CLASS_III
Checks if the given H3 index has a resolution with Class III orientation.
Analyze Syntax
func.h3_is_res_class_iii(h3)
Analyze Examples
func.h3_is_res_class_iii(635318325446452991)
┌──────────────────────────────────────────────┐
│ func.h3_is_res_class_iii(635318325446452991) │
├──────────────────────────────────────────────┤
│ true │
└──────────────────────────────────────────────┘
SQL Syntax
H3_IS_RES_CLASS_III(h3)
SQL Examples
SELECT H3_IS_RES_CLASS_III(635318325446452991);
┌─────────────────────────────────────────┐
│ h3_is_res_class_iii(635318325446452991) │
├─────────────────────────────────────────┤
│ true │
└─────────────────────────────────────────┘
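In H3, odd resolutions use the Class III orientation and even resolutions use Class II, so the check reduces to the parity of the resolution field (bits 52 to 55 of the index). A small Python sketch under that assumption, with a hypothetical helper name:

```python
def is_res_class_iii(h3: int) -> bool:
    """True when the index's resolution (bits 52-55) is odd, i.e. Class III."""
    resolution = (h3 >> 52) & 0xF
    return resolution % 2 == 1


print(is_res_class_iii(635318325446452991))  # True  (resolution 13)
print(is_res_class_iii(630814725819082751))  # False (resolution 12)
```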
9.28 - H3_IS_VALID
Checks if the given H3 index is valid.
Analyze Syntax
func.h3_is_valid(h3)
Analyze Examples
func.h3_is_valid(644325524701193974)
┌──────────────────────────────────────┐
│ func.h3_is_valid(644325524701193974) │
├──────────────────────────────────────┤
│ true │
└──────────────────────────────────────┘
SQL Syntax
H3_IS_VALID(h3)
SQL Examples
SELECT H3_IS_VALID(644325524701193974);
┌─────────────────────────────────┐
│ h3_is_valid(644325524701193974) │
├─────────────────────────────────┤
│ true │
└─────────────────────────────────┘
9.29 - H3_K_RING
Returns an array containing the H3 indexes of the k-ring hexagons surrounding the input H3 index. Each element in this array is an H3 index.
Analyze Syntax
func.h3_k_ring(h3, k)
Analyze Examples
func.h3_k_ring(644325524701193974, 1)
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.h3_k_ring(644325524701193974, 1) │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [644325524701193974,644325524701193899,644325524701193869,644325524701193970,644325524701193968,644325524701193972,644325524701193897] │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_K_RING(h3, k)
SQL Examples
SELECT H3_K_RING(644325524701193974, 1);
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ h3_k_ring(644325524701193974, 1) │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [644325524701193974,644325524701193899,644325524701193869,644325524701193970,644325524701193968,644325524701193972,644325524701193897] │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
9.30 - H3_LINE
Returns the line of indexes between the given two H3 indexes.
Analyze Syntax
func.h3_line(h3, a_h3)
Analyze Examples
func.h3_line(599119489002373119, 599119491149856767)
┌──────────────────────────────────────────────────────┐
│ func.h3_line(599119489002373119, 599119491149856767) │
├──────────────────────────────────────────────────────┤
│ [599119489002373119,599119491149856767] │
└──────────────────────────────────────────────────────┘
SQL Syntax
H3_LINE(h3, a_h3)
SQL Examples
SELECT H3_LINE(599119489002373119, 599119491149856767);
┌─────────────────────────────────────────────────┐
│ h3_line(599119489002373119, 599119491149856767) │
├─────────────────────────────────────────────────┤
│ [599119489002373119,599119491149856767] │
└─────────────────────────────────────────────────┘
9.31 - H3_NUM_HEXAGONS
Returns the number of unique H3 indexes at the given resolution.
Analyze Syntax
func.h3_num_hexagons(res)
Analyze Examples
func.h3_num_hexagons(10)
┌──────────────────────────┐
│ func.h3_num_hexagons(10) │
├──────────────────────────┤
│ 33897029882 │
└──────────────────────────┘
SQL Syntax
H3_NUM_HEXAGONS(res)
SQL Examples
SELECT H3_NUM_HEXAGONS(10);
┌─────────────────────┐
│ h3_num_hexagons(10) │
├─────────────────────┤
│ 33897029882 │
└─────────────────────┘
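The count follows a closed form: H3 has 122 base cells at resolution 0, and each finer resolution subdivides cells by an aperture of 7 (with the 12 pentagons producing one child fewer), which gives 2 + 120 * 7^res cells. A short Python sketch of that formula, with a hypothetical function name:

```python
def num_h3_cells(res: int) -> int:
    """Number of unique H3 indexes at resolution `res` (0 to 15)."""
    if not 0 <= res <= 15:
        raise ValueError("H3 resolutions range from 0 to 15")
    return 2 + 120 * 7 ** res


print(num_h3_cells(0))   # 122
print(num_h3_cells(10))  # 33897029882
```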
9.32 - H3_TO_CENTER_CHILD
Returns the center child index at the specified resolution.
Analyze Syntax
func.h3_to_center_child(h3, res)
Analyze Examples
func.h3_to_center_child(599119489002373119, 15)
┌─────────────────────────────────────────────────┐
│ func.h3_to_center_child(599119489002373119, 15) │
├─────────────────────────────────────────────────┤
│ 644155484202336256 │
└─────────────────────────────────────────────────┘
SQL Syntax
H3_TO_CENTER_CHILD(h3, res)
SQL Examples
SELECT H3_TO_CENTER_CHILD(599119489002373119, 15);
┌────────────────────────────────────────────┐
│ h3_to_center_child(599119489002373119, 15) │
├────────────────────────────────────────────┤
│ 644155484202336256 │
└────────────────────────────────────────────┘
9.33 - H3_TO_CHILDREN
Returns the indexes contained by h3 at resolution child_res.
Analyze Syntax
func.h3_to_children(h3, child_res)
Analyze Examples
func.h3_to_children(635318325446452991, 14)
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.h3_to_children(635318325446452991, 14) │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [639821925073823431,639821925073823439,639821925073823447,639821925073823455,639821925073823463,639821925073823471,639821925073823479] │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_TO_CHILDREN(h3, child_res)
SQL Examples
SELECT H3_TO_CHILDREN(635318325446452991, 14);
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ h3_to_children(635318325446452991, 14) │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [639821925073823431,639821925073823439,639821925073823447,639821925073823455,639821925073823463,639821925073823471,639821925073823479] │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
9.34 - H3_TO_GEO
Returns the longitude and latitude corresponding to the given H3 index.
Analyze Syntax
func.h3_to_geo(h3)
Analyze Examples
func.h3_to_geo(644325524701193974)
┌────────────────────────────────────────┐
│ func.h3_to_geo(644325524701193974) │
├────────────────────────────────────────┤
│ (37.79506616830255,55.712902431456676) │
└────────────────────────────────────────┘
SQL Syntax
H3_TO_GEO(h3)
SQL Examples
SELECT H3_TO_GEO(644325524701193974);
┌────────────────────────────────────────┐
│ h3_to_geo(644325524701193974) │
├────────────────────────────────────────┤
│ (37.79506616830255,55.712902431456676) │
└────────────────────────────────────────┘
9.35 - H3_TO_GEO_BOUNDARY
Returns an array containing the longitude and latitude coordinates of the vertices of the hexagon corresponding to the H3 index.
Analyze Syntax
func.h3_to_geo_boundary(h3)
Analyze Examples
func.h3_to_geo_boundary(644325524701193974)
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.h3_to_geo_boundary(644325524701193974) │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [(37.79505811173477,55.712900225355526),(37.79506506997187,55.71289713485417),(37.795073126539855,55.71289934095484),(37.795074224871684,55.71290463755745),(37.79506726663349,55.71290772805916),(37.79505921006456,55.712905521957914)] │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
H3_TO_GEO_BOUNDARY(h3)
SQL Examples
SELECT H3_TO_GEO_BOUNDARY(644325524701193974);
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ h3_to_geo_boundary(644325524701193974) │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ [(37.79505811173477,55.712900225355526),(37.79506506997187,55.71289713485417),(37.795073126539855,55.71289934095484),(37.795074224871684,55.71290463755745),(37.79506726663349,55.71290772805916),(37.79505921006456,55.712905521957914)] │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
9.36 - H3_TO_PARENT
Returns the parent index containing h3 at resolution parent_res. Returning 0 means an error occurred.
Analyze Syntax
func.h3_to_parent(h3, parent_res)
Analyze Examples
func.h3_to_parent(635318325446452991, 12)
┌───────────────────────────────────────────┐
│ func.h3_to_parent(635318325446452991, 12) │
├───────────────────────────────────────────┤
│ 630814725819082751 │
└───────────────────────────────────────────┘
SQL Syntax
H3_TO_PARENT(h3, parent_res)
SQL Examples
SELECT H3_TO_PARENT(635318325446452991, 12);
┌──────────────────────────────────────┐
│ h3_to_parent(635318325446452991, 12) │
├──────────────────────────────────────┤
│ 630814725819082751 │
└──────────────────────────────────────┘
9.37 - H3_TO_STRING
Converts the integer representation of the given H3 index to its string representation.
Analyze Syntax
func.h3_to_string(h3)
Analyze Examples
func.h3_to_string(635318325446452991)
┌───────────────────────────────────────┐
│ func.h3_to_string(635318325446452991) │
├───────────────────────────────────────┤
│ 8d11aa6a38826ff │
└───────────────────────────────────────┘
SQL Syntax
SQL Examples
SELECT H3_TO_STRING(635318325446452991);
┌──────────────────────────────────┐
│ h3_to_string(635318325446452991) │
├──────────────────────────────────┤
│ 8d11aa6a38826ff │
└──────────────────────────────────┘
9.38 - H3_UNIDIRECTIONAL_EDGE_IS_VALID
Determines if the provided H3Index is a valid unidirectional edge index. Returns 1 if it's a unidirectional edge and 0 otherwise.
Analyze Syntax
func.h3_unidirectional_edge_is_valid(h3)
Analyze Examples
func.h3_unidirectional_edge_is_valid(1248204388774707199)
┌───────────────────────────────────────────────────────────┐
│ func.h3_unidirectional_edge_is_valid(1248204388774707199) │
├───────────────────────────────────────────────────────────┤
│ true │
└───────────────────────────────────────────────────────────┘
SQL Syntax
H3_UNIDIRECTIONAL_EDGE_IS_VALID(h3)
SQL Examples
SELECT H3_UNIDIRECTIONAL_EDGE_IS_VALID(1248204388774707199);
┌──────────────────────────────────────────────────────┐
│ h3_unidirectional_edge_is_valid(1248204388774707199) │
├──────────────────────────────────────────────────────┤
│ true │
└──────────────────────────────────────────────────────┘
9.39 - POINT_IN_POLYGON
Calculates whether a given point falls within the polygon formed by joining multiple points. A polygon is a closed shape connected by coordinate pairs in the order they appear. Changing the order of coordinate pairs can result in a different shape.
Analyze Syntax
func.point_in_polygon((x,y), [(a,b), (c,d), (e,f) ... ])
Analyze Examples
func.point_in_polygon((3., 3.), [(6, 0), (8, 4), (5, 8), (0, 2)])
┌─────────────────────────────────────────────────────────────────┐
│ func.point_in_polygon((3, 3), [(6, 0), (8, 4), (5, 8), (0, 2)]) │
├─────────────────────────────────────────────────────────────────┤
│ 1 │
└─────────────────────────────────────────────────────────────────┘
SQL Syntax
POINT_IN_POLYGON((x,y), [(a,b), (c,d), (e,f) ... ])
SQL Examples
SELECT POINT_IN_POLYGON((3., 3.), [(6, 0), (8, 4), (5, 8), (0, 2)]);
┌────────────────────────────────────────────────────────────┐
│ point_in_polygon((3, 3), [(6, 0), (8, 4), (5, 8), (0, 2)]) │
├────────────────────────────────────────────────────────────┤
│ 1 │
└────────────────────────────────────────────────────────────┘
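The containment test can be illustrated with the classic ray-casting (even-odd) rule: count how many polygon edges a horizontal ray from the point crosses, and treat an odd count as "inside". The sketch below is a plain Python illustration under that assumption; it is not the Lakehouse implementation, and the helper name is hypothetical. Points lying exactly on an edge are not handled specially.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Even-odd ray casting; vertices are joined in the order given."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # the last vertex connects back to the first
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


polygon = [(6, 0), (8, 4), (5, 8), (0, 2)]
inside_result = point_in_polygon((3.0, 3.0), polygon)
outside_result = point_in_polygon((9.0, 9.0), polygon)
```

Because edges are built from consecutive vertices, passing the same coordinate pairs in a different order describes a different shape and can change the result.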
9.40 - STRING_TO_H3
Converts the string representation to H3 (uint64) representation.
Analyze Syntax
Analyze Examples
func.string_to_h3('8d11aa6a38826ff')
┌──────────────────────────────────────┐
│ func.string_to_h3('8d11aa6a38826ff') │
├──────────────────────────────────────┤
│ 635318325446452991 │
└──────────────────────────────────────┘
SQL Syntax
SQL Examples
SELECT STRING_TO_H3('8d11aa6a38826ff');
┌─────────────────────────────────┐
│ string_to_h3('8d11aa6a38826ff') │
├─────────────────────────────────┤
│ 635318325446452991 │
└─────────────────────────────────┘
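Judging by the examples, STRING_TO_H3 and H3_TO_STRING are inverses, and the string form is the hexadecimal spelling of the 64-bit index. A minimal Python sketch of that relationship follows. The helper names are hypothetical, the code uses only built-in integer formatting, and it does not check that the value is a well-formed H3 index.

```python
def h3_to_string(h3: int) -> str:
    """Spell a 64-bit H3 index as lowercase hexadecimal."""
    return format(h3, "x")


def string_to_h3(text: str) -> int:
    """Parse the hexadecimal spelling back into an integer."""
    return int(text, 16)


index = 635318325446452991
as_text = h3_to_string(index)
round_trip = string_to_h3(as_text)
```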
10 - Hash Functions
This section provides reference information for the Hash functions in PlaidCloud Lakehouse.
10.1 - BLAKE3
Calculates a BLAKE3 256-bit checksum for a string. The value is returned as a string of 64 hexadecimal digits or NULL if the argument was NULL.
Analyze Syntax
Analyze Examples
func.blake3('1234567890')
+------------------------------------------------------------------+
| func.blake3('1234567890') |
+------------------------------------------------------------------+
| d12e417e04494572b561ba2c12c3d7f9e5107c4747e27b9a8a54f8480c63e841 |
+------------------------------------------------------------------+
SQL Syntax
SQL Examples
SELECT BLAKE3('1234567890');
┌──────────────────────────────────────────────────────────────────┐
│ blake3('1234567890') │
├──────────────────────────────────────────────────────────────────┤
│ d12e417e04494572b561ba2c12c3d7f9e5107c4747e27b9a8a54f8480c63e841 │
└──────────────────────────────────────────────────────────────────┘
10.2 - CITY64WITHSEED
Calculates a City64WithSeed 64-bit hash for a string.
Analyze Syntax
func.city64withseed(<expr1>, <expr2>)
Analyze Examples
func.city64withseed('1234567890', 12)
+---------------------------------------+
| func.city64withseed('1234567890', 12) |
+---------------------------------------+
| 10660895976650300430 |
+---------------------------------------+
SQL Syntax
CITY64WITHSEED(<expr1>, <expr2>)
SQL Examples
SELECT CITY64WITHSEED('1234567890', 12);
┌──────────────────────────────────┐
│ city64withseed('1234567890', 12) │
├──────────────────────────────────┤
│ 10660895976650300430 │
└──────────────────────────────────┘
10.3 - MD5
Calculates an MD5 128-bit checksum for a string. The value is returned as a string of 32 hexadecimal digits or NULL if the argument was NULL.
Analyze Syntax
Analyze Examples
func.md5('1234567890')
+------------------------------------------+
| func.md5('1234567890') |
+------------------------------------------+
| e807f1fcf82d132f9bb018ca6738a19f |
+------------------------------------------+
SQL Syntax
SQL Examples
SELECT MD5('1234567890');
┌──────────────────────────────────┐
│ md5('1234567890') │
├──────────────────────────────────┤
│ e807f1fcf82d132f9bb018ca6738a19f │
└──────────────────────────────────┘
10.4 - SHA
Calculates an SHA-1 160-bit checksum for the string, as described in RFC 3174 (Secure Hash Algorithm). The value is returned as a string of 40 hexadecimal digits or NULL if the argument was NULL.
Analyze Syntax
Analyze Examples
func.sha('1234567890')
+------------------------------------------+
| func.sha('1234567890') |
+------------------------------------------+
| 01b307acba4f54f55aafc33bb06bbbf6ca803e9a |
+------------------------------------------+
SQL Syntax
Aliases
SQL Examples
SELECT SHA('1234567890'), SHA1('1234567890');
┌─────────────────────────────────────────────────────────────────────────────────────┐
│ sha('1234567890') │ sha1('1234567890') │
├──────────────────────────────────────────┼──────────────────────────────────────────┤
│ 01b307acba4f54f55aafc33bb06bbbf6ca803e9a │ 01b307acba4f54f55aafc33bb06bbbf6ca803e9a │
└─────────────────────────────────────────────────────────────────────────────────────┘
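The MD5 and SHA results above can be cross-checked outside the platform. The sketch below uses Python's standard hashlib module and assumes the input string is hashed as its UTF-8 bytes, which is an assumption about the platform's encoding rather than something this reference states.

```python
import hashlib

# Assumption: the string is hashed as UTF-8 bytes.
data = "1234567890".encode("utf-8")

md5_hex = hashlib.md5(data).hexdigest()    # 128 bits, 32 hex digits
sha1_hex = hashlib.sha1(data).hexdigest()  # 160 bits, 40 hex digits
```

Each hexadecimal digit carries 4 bits, which is why a 128-bit checksum prints as 32 digits and a 160-bit checksum as 40.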
10.5 - SHA1
Alias for SHA.
10.6 - SHA2
Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). If the hash length is not one of the permitted values, the return value is NULL. Otherwise, the function result is a hash value containing the desired number of bits as a string of hexadecimal digits.
Analyze Syntax
func.sha2(<expr>, <expr>)
Analyze Examples
func.sha2('1234567890', 0)
+------------------------------------------------------------------+
| func.sha2('1234567890', 0) |
+------------------------------------------------------------------+
| c775e7b757ede630cd0aa1113bd102661ab38829ca52a6422ab782862f268646 |
+------------------------------------------------------------------+
SQL Syntax
SQL Examples
SELECT SHA2('1234567890', 0);
┌──────────────────────────────────────────────────────────────────┐
│ sha2('1234567890', 0) │
├──────────────────────────────────────────────────────────────────┤
│ c775e7b757ede630cd0aa1113bd102661ab38829ca52a6422ab782862f268646 │
└──────────────────────────────────────────────────────────────────┘
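The relationship between the hash length argument and the result can be sketched with Python's hashlib. The helper below is hypothetical. Treating a length of 0 as SHA-256 is an assumption inferred from the example above, whose output has 64 hexadecimal digits. UTF-8 encoding of the input is also assumed.

```python
import hashlib
from typing import Optional

# Permitted hash lengths mapped to the matching hashlib constructors.
_SHA2_VARIANTS = {
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}


def sha2(text: str, bits: int) -> Optional[str]:
    """Return the SHA-2 digest as hex, or None for an unsupported length."""
    if bits == 0:
        bits = 256  # assumption: 0 selects SHA-256, as the example suggests
    constructor = _SHA2_VARIANTS.get(bits)
    if constructor is None:
        return None  # mirrors the documented NULL result
    return constructor(text.encode("utf-8")).hexdigest()


digest_default = sha2("1234567890", 0)
digest_512 = sha2("1234567890", 512)
digest_invalid = sha2("1234567890", 100)
```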
10.7 - SIPHASH
Alias for SIPHASH64.
10.8 - SIPHASH64
Produces a 64-bit SipHash hash value.
Analyze Syntax
Analyze Examples
func.siphash64('1234567890')
+-------------------------------+
| func.siphash64('1234567890') |
+-------------------------------+
| 18110648197875983073 |
+-------------------------------+
SQL Syntax
Aliases
SQL Examples
SELECT SIPHASH('1234567890'), SIPHASH64('1234567890');
┌─────────────────────────────────────────────────┐
│ siphash('1234567890') │ siphash64('1234567890') │
├───────────────────────┼─────────────────────────┤
│ 18110648197875983073 │ 18110648197875983073 │
└─────────────────────────────────────────────────┘
10.9 - XXHASH32
Calculates an xxHash32 32-bit hash value for a string. The value is returned as a UInt32 or NULL if the argument was NULL.
Analyze Syntax
Analyze Examples
func.xxhash32('1234567890')
+-----------------------------+
| func.xxhash32('1234567890') |
+-----------------------------+
| 3896585587 |
+-----------------------------+
SQL Syntax
SQL Examples
SELECT XXHASH32('1234567890');
┌────────────────────────┐
│ xxhash32('1234567890') │
├────────────────────────┤
│ 3896585587 │
└────────────────────────┘
10.10 - XXHASH64
Calculates an xxHash64 64-bit hash value for a string. The value is returned as a UInt64 or NULL if the argument was NULL.
Analyze Syntax
Analyze Examples
func.xxhash64('1234567890')
+-----------------------------+
| func.xxhash64('1234567890') |
+-----------------------------+
| 12237639266330420150 |
+-----------------------------+
SQL Syntax
SQL Examples
SELECT XXHASH64('1234567890');
┌────────────────────────┐
│ xxhash64('1234567890') │
├────────────────────────┤
│ 12237639266330420150 │
└────────────────────────┘
11 - IP Address Functions
This section provides reference information for the IP address-related functions in PlaidCloud Lakehouse.
11.1 - INET_ATON
Converts an IPv4 address to a 32-bit integer.
Analyze Syntax
Analyze Examples
func.inet_aton('1.2.3.4')
┌───────────────────────────────┐
│ func.inet_aton('1.2.3.4') │
├───────────────────────────────┤
│ 16909060 │
└───────────────────────────────┘
SQL Syntax
Aliases
Return Type
Integer.
SQL Examples
SELECT IPV4_STRING_TO_NUM('1.2.3.4'), INET_ATON('1.2.3.4');
┌──────────────────────────────────────────────────────┐
│ ipv4_string_to_num('1.2.3.4') │ inet_aton('1.2.3.4') │
├───────────────────────────────┼──────────────────────┤
│ 16909060 │ 16909060 │
└──────────────────────────────────────────────────────┘
11.2 - INET_NTOA
Converts a 32-bit integer to an IPv4 address.
Analyze Syntax
Analyze Examples
func.inet_ntoa(16909060)
┌──────────────────────────────┐
│ func.inet_ntoa(16909060) │
├──────────────────────────────┤
│ 1.2.3.4 │
└──────────────────────────────┘
SQL Syntax
Aliases
Return Type
String.
SQL Examples
SELECT IPV4_NUM_TO_STRING(16909060), INET_NTOA(16909060);
┌────────────────────────────────────────────────────┐
│ ipv4_num_to_string(16909060) │ inet_ntoa(16909060) │
├──────────────────────────────┼─────────────────────┤
│ 1.2.3.4 │ 1.2.3.4 │
└────────────────────────────────────────────────────┘
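The conversion between the dotted-quad string and the 32-bit integer is plain base-256 arithmetic: each of the four parts occupies one byte, most significant first. The Python sketch below shows the arithmetic with hypothetical helper names and checks it against the standard ipaddress module. It performs no input validation.

```python
import ipaddress


def inet_aton(address: str) -> int:
    """Pack the four dotted-quad parts into one 32-bit integer."""
    a, b, c, d = (int(part) for part in address.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def inet_ntoa(number: int) -> str:
    """Unpack a 32-bit integer into its four dotted-quad parts."""
    return ".".join(str((number >> shift) & 0xFF) for shift in (24, 16, 8, 0))


packed = inet_aton("1.2.3.4")      # 1*2**24 + 2*2**16 + 3*2**8 + 4
unpacked = inet_ntoa(16909060)
stdlib_value = int(ipaddress.IPv4Address("10.0.5.9"))
```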
11.3 - IPV4_NUM_TO_STRING
Alias for INET_NTOA.
11.4 - IPV4_STRING_TO_NUM
Alias for INET_ATON.
11.5 - TRY_INET_ATON
Takes the dotted-quad representation of an IPv4 address as a string and returns the numeric value of the address as an integer.
Analyze Syntax
func.try_inet_aton(<str>)
Analyze Examples
func.try_inet_aton('10.0.5.9')
┌────────────────────────────────┐
│ func.try_inet_aton('10.0.5.9') │
├────────────────────────────────┤
│ 167773449 │
└────────────────────────────────┘
SQL Syntax
Aliases
Return Type
Integer.
SQL Examples
SELECT TRY_INET_ATON('10.0.5.9'), TRY_IPV4_STRING_TO_NUM('10.0.5.9');
┌────────────────────────────────────────────────────────────────┐
│ try_inet_aton('10.0.5.9') │ try_ipv4_string_to_num('10.0.5.9') │
│ UInt32 │ UInt32 │
├───────────────────────────┼────────────────────────────────────┤
│ 167773449 │ 167773449 │
└────────────────────────────────────────────────────────────────┘
11.6 - TRY_INET_NTOA
Takes an IPv4 address in network byte order and then returns the address as a dotted-quad string representation.
Analyze Syntax
func.try_inet_ntoa(<integer>)
Analyze Examples
func.try_inet_ntoa(167773449)
┌───────────────────────────────┐
│ func.try_inet_ntoa(167773449) │
├───────────────────────────────┤
│ 10.0.5.9 │
└───────────────────────────────┘
SQL Syntax
TRY_INET_NTOA( <integer> )
Aliases
Return Type
String.
SQL Examples
SELECT TRY_INET_NTOA(167773449), TRY_IPV4_NUM_TO_STRING(167773449);
┌──────────────────────────────────────────────────────────────┐
│ try_inet_ntoa(167773449) │ try_ipv4_num_to_string(167773449) │
├──────────────────────────┼───────────────────────────────────┤
│ 10.0.5.9 │ 10.0.5.9 │
└──────────────────────────────────────────────────────────────┘
11.7 - TRY_IPV4_NUM_TO_STRING
Alias for TRY_INET_NTOA.
11.8 - TRY_IPV4_STRING_TO_NUM
Alias for TRY_INET_ATON.
12 - Numeric Functions
This section provides reference information for the numeric functions in PlaidCloud Lakehouse.
12.1 - ABS
Returns the absolute value of x.
Analyze Syntax
Analyze Examples
func.abs((- 5))
┌─────────────────┐
│ func.abs((- 5)) │
├─────────────────┤
│ 5 │
└─────────────────┘
SQL Syntax
SQL Examples
SELECT ABS(-5);
┌────────────┐
│ abs((- 5)) │
├────────────┤
│ 5 │
└────────────┘
12.2 - ACOS
Returns the arc cosine of x, that is, the value whose cosine is x. Returns NULL if x is not in the range -1 to 1.
Analyze Syntax
Analyze Examples
func.acos(1)
┌──────────────┐
│ func.acos(1) │
├──────────────┤
│ 0 │
└──────────────┘
SQL Syntax
SQL Examples
SELECT ACOS(1);
┌─────────┐
│ acos(1) │
├─────────┤
│ 0 │
└─────────┘
12.3 - ADD
Alias for PLUS.
12.4 - ASIN
Returns the arc sine of x, that is, the value whose sine is x. Returns NULL if x is not in the range -1 to 1.
Analyze Syntax
Analyze Examples
func.asin(0.2)
┌────────────────────┐
│ func.asin(0.2) │
├────────────────────┤
│ 0.2013579207903308 │
└────────────────────┘
SQL Syntax
SQL Examples
SELECT ASIN(0.2);
┌────────────────────┐
│ asin(0.2) │
├────────────────────┤
│ 0.2013579207903308 │
└────────────────────┘
12.5 - ATAN
Returns the arc tangent of x, that is, the value whose tangent is x.
Analyze Syntax
Analyze Examples
func.atan(-2)
┌─────────────────────┐
│ func.atan((- 2)) │
├─────────────────────┤
│ -1.1071487177940906 │
└─────────────────────┘
SQL Syntax
SQL Examples
SELECT ATAN(-2);
┌─────────────────────┐
│ atan((- 2)) │
├─────────────────────┤
│ -1.1071487177940906 │
└─────────────────────┘
12.6 - ATAN2
Returns the arc tangent of the two variables x and y. It is similar to calculating the arc tangent of y / x, except that the signs of both arguments are used to determine the quadrant of the result. ATAN(y, x) is a synonym for ATAN2(y, x).
Analyze Syntax
Analyze Examples
func.atan2((- 2), 2)
┌─────────────────────┐
│ func.atan2((- 2), 2)│
├─────────────────────┤
│ -0.7853981633974483 │
└─────────────────────┘
SQL Syntax
SQL Examples
SELECT ATAN2(-2, 2);
┌─────────────────────┐
│ atan2((- 2), 2) │
├─────────────────────┤
│ -0.7853981633974483 │
└─────────────────────┘
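The quadrant behavior is easiest to see by comparing the two-argument form with the arc tangent of the plain quotient. The sketch below uses Python's math module, whose atan2(y, x) follows the same argument order. It illustrates the mathematics only and says nothing about how the platform evaluates the function.

```python
import math

# Same quotient y / x = -1, but the points lie in different quadrants.
fourth_quadrant = math.atan2(-2, 2)   # x > 0, y < 0
second_quadrant = math.atan2(2, -2)   # x < 0, y > 0

# The single-argument arc tangent only sees the quotient, so it cannot
# tell the two cases apart.
quotient_only = math.atan(2 / -2)
```

Here fourth_quadrant and quotient_only are both -π/4, while second_quadrant is 3π/4.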
12.7 - CBRT
Returns the cube root of a nonnegative number x.
Analyze Syntax
Analyze Examples
func.cbrt(27)
┌───────────────┐
│ func.cbrt(27) │
├───────────────┤
│ 3 │
└───────────────┘
SQL Syntax
SQL Examples
SELECT CBRT(27);
┌──────────┐
│ cbrt(27) │
├──────────┤
│ 3 │
└──────────┘
12.8 - CEIL
Rounds the number up.
Analyze Syntax
Analyze Examples
func.ceil((- 1.23))
┌─────────────────────┐
│ func.ceil((- 1.23)) │
├─────────────────────┤
│ -1 │
└─────────────────────┘
SQL Syntax
Aliases
SQL Examples
SELECT CEILING(-1.23), CEIL(-1.23);
┌────────────────────────────────────┐
│ ceiling((- 1.23)) │ ceil((- 1.23)) │
├───────────────────┼────────────────┤
│ -1 │ -1 │
└────────────────────────────────────┘
12.9 - CEILING
Alias for CEIL.
12.10 - COS
Returns the cosine of x, where x is given in radians.
Analyze Syntax
Analyze Examples
func.cos(func.pi())
┌─────────────────────┐
│ func.cos(func.pi()) │
├─────────────────────┤
│ -1 │
└─────────────────────┘
SQL Syntax
SQL Examples
SELECT COS(PI());
┌───────────┐
│ cos(pi()) │
├───────────┤
│ -1 │
└───────────┘
12.11 - COT
Returns the cotangent of x, where x is given in radians.
Analyze Syntax
Analyze Examples
func.cot(12)
┌─────────────────────┐
│ func.cot(12) │
├─────────────────────┤
│ -1.5726734063976895 │
└─────────────────────┘
SQL Syntax
SQL Examples
SELECT COT(12);
┌─────────────────────┐
│ cot(12) │
├─────────────────────┤
│ -1.5726734063976895 │
└─────────────────────┘
12.12 - CRC32
Returns the CRC32 checksum of x, where x is expected to be a string and (if possible) is treated as one if it is not.
Analyze Syntax
Analyze Examples
func.crc32('databend')
┌────────────────────────┐
│ func.crc32('databend') │
├────────────────────────┤
│ 1177678456 │
└────────────────────────┘
SQL Syntax
SQL Examples
SELECT CRC32('databend');
┌───────────────────┐
│ crc32('databend') │
├───────────────────┤
│ 1177678456 │
└───────────────────┘
12.13 - DEGREES
Returns the argument x, converted from radians to degrees.
Analyze Syntax
Analyze Examples
func.degrees(func.pi())
┌─────────────────────────┐
│ func.degrees(func.pi()) │
├─────────────────────────┤
│ 180 │
└─────────────────────────┘
SQL Syntax
SQL Examples
SELECT DEGREES(PI());
┌───────────────┐
│ degrees(pi()) │
├───────────────┤
│ 180 │
└───────────────┘
12.14 - DIV
Returns the quotient by dividing the first number by the second one, rounding down to the closest smaller integer. Equivalent to the division operator //.
See also: DIV0, DIVNULL
Analyze Syntax
func.div(<numerator>, <denominator>)
Analyze Examples
# Equivalent to the division operator "//"
func.div(6.1, 2)
┌───────────────────────────────┐
│ func.div(6.1, 2) │ (6.1 // 2) │
├──────────────────┼────────────┤
│ 3 │ 3 │
└───────────────────────────────┘
# Error when divided by 0
error: APIError: ResponseError with 1006: divided by zero while evaluating function `div(6.1, 0)`
SQL Syntax
Aliases
SQL Examples
-- Equivalent to the division operator "//"
SELECT 6.1 DIV 2, 6.1//2;
┌──────────────────────────┐
│ (6.1 div 2) │ (6.1 // 2) │
├─────────────┼────────────┤
│ 3 │ 3 │
└──────────────────────────┘
SELECT 6.1 DIV 2, INTDIV(6.1, 2), 6.1 DIV NULL;
┌───────────────────────────────────────────────┐
│ (6.1 div 2) │ intdiv(6.1, 2) │ (6.1 div null) │
├─────────────┼────────────────┼────────────────┤
│ 3 │ 3 │ NULL │
└───────────────────────────────────────────────┘
-- Error when divided by 0
SELECT 6.1 DIV 0;
error: APIError: ResponseError with 1006: divided by zero while evaluating function `div(6.1, 0)`
12.15 - DIV0
Returns the quotient by dividing the first number by the second one. Returns 0 if the second number is 0.
See also: DIV, DIVNULL
Analyze Syntax
func.div0(<numerator>, <denominator>)
Analyze Examples
func.div0(20, 6), func.div0(20, 0), func.div0(20, null)
┌─────────────────────────────────────────────────────────────┐
│ func.div0(20, 6) │ func.div0(20, 0) │ func.div0(20, null) │
├────────────────────┼──────────────────┼─────────────────────┤
│ 3.3333333333333335 │ 0 │ NULL │
└─────────────────────────────────────────────────────────────┘
SQL Syntax
DIV0(<number1>, <number2>)
SQL Examples
SELECT
DIV0(20, 6),
DIV0(20, 0),
DIV0(20, NULL);
┌───────────────────────────────────────────────────┐
│ div0(20, 6) │ div0(20, 0) │ div0(20, null) │
├────────────────────┼─────────────┼────────────────┤
│ 3.3333333333333335 │ 0 │ NULL │
└───────────────────────────────────────────────────┘
12.16 - DIVNULL
Returns the quotient by dividing the first number by the second one. Returns NULL if the second number is 0 or NULL.
See also: DIV, DIV0
Analyze Syntax
func.divnull(<numerator>, <denominator>)
Analyze Examples
func.divnull(20, 6), func.divnull(20, 0), func.divnull(20, null)
┌───────────────────────────────────────────────────────────────────┐
│ func.divnull(20, 6)│ func.divnull(20, 0) │ func.divnull(20, null) │
├────────────────────┼─────────────────────┼────────────────────────┤
│ 3.3333333333333335 │ NULL │ NULL │
└───────────────────────────────────────────────────────────────────┘
SQL Syntax
DIVNULL(<number1>, <number2>)
SQL Examples
SELECT
DIVNULL(20, 6),
DIVNULL(20, 0),
DIVNULL(20, NULL);
┌─────────────────────────────────────────────────────────┐
│ divnull(20, 6) │ divnull(20, 0) │ divnull(20, null) │
├────────────────────┼────────────────┼───────────────────┤
│ 3.3333333333333335 │ NULL │ NULL │
└─────────────────────────────────────────────────────────┘
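DIV, DIV0, and DIVNULL differ only in rounding and in how they treat a zero or NULL divisor. The Python sketch below restates the documented behavior side by side, using None to stand in for NULL. The helper names are hypothetical and the code is an illustration, not the platform's implementation.

```python
import math
from typing import Optional


def div(x: Optional[float], y: Optional[float]) -> Optional[int]:
    """Integer quotient rounded down; a zero divisor is an error."""
    if x is None or y is None:
        return None
    return math.floor(x / y)  # raises ZeroDivisionError when y == 0


def div0(x: Optional[float], y: Optional[float]) -> Optional[float]:
    """Plain quotient; a zero divisor yields 0."""
    if x is None or y is None:
        return None
    return 0 if y == 0 else x / y


def divnull(x: Optional[float], y: Optional[float]) -> Optional[float]:
    """Plain quotient; a zero or NULL divisor yields NULL."""
    if x is None or y is None or y == 0:
        return None
    return x / y


div_result = div(6.1, 2)
div0_results = (div0(20, 6), div0(20, 0), div0(20, None))
divnull_results = (divnull(20, 6), divnull(20, 0), divnull(20, None))
```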
12.17 - EXP
Returns the value of e (the base of natural logarithms) raised to the power of x.
Analyze Syntax
Analyze Examples
func.exp(2)
┌──────────────────┐
│ func.exp(2) │
├──────────────────┤
│ 7.38905609893065 │
└──────────────────┘
SQL Syntax
SQL Examples
SELECT EXP(2);
┌──────────────────┐
│ exp(2) │
├──────────────────┤
│ 7.38905609893065 │
└──────────────────┘
12.18 - FACTORIAL
Returns the factorial of x. If x is less than or equal to 0, the function returns 0.
Analyze Syntax
Analyze Examples
func.factorial(5)
┌───────────────────┐
│ func.factorial(5) │
├───────────────────┤
│ 120 │
└───────────────────┘
SQL Syntax
SQL Examples
SELECT FACTORIAL(5);
┌──────────────┐
│ factorial(5) │
├──────────────┤
│ 120 │
└──────────────┘
12.19 - FLOOR
Rounds the number down.
Analyze Syntax
Analyze Examples
func.floor(1.23)
┌──────────────────┐
│ func.floor(1.23) │
├──────────────────┤
│ 1 │
└──────────────────┘
SQL Syntax
SQL Examples
SELECT FLOOR(1.23);
┌─────────────┐
│ floor(1.23) │
├─────────────┤
│ 1 │
└─────────────┘
12.20 - INTDIV
Alias for DIV.
12.21 - LN
Returns the natural logarithm of x; that is, the base-e logarithm of x. If x is less than or equal to 0.0E0, the function returns NULL.
Analyze Syntax
Analyze Examples
func.ln(2)
┌────────────────────┐
│ func.ln(2) │
├────────────────────┤
│ 0.6931471805599453 │
└────────────────────┘
SQL Syntax
SQL Examples
SELECT LN(2);
┌────────────────────┐
│ ln(2) │
├────────────────────┤
│ 0.6931471805599453 │
└────────────────────┘
12.22 - LOG(b, x)
Returns the base-b logarithm of x. If x is less than or equal to 0.0E0, the function returns NULL.
Analyze Syntax
Analyze Examples
func.log(2, 65536)
┌────────────────────┐
│ func.log(2, 65536) │
├────────────────────┤
│ 16 │
└────────────────────┘
SQL Syntax
SQL Examples
SELECT LOG(2, 65536);
┌───────────────┐
│ log(2, 65536) │
├───────────────┤
│ 16 │
└───────────────┘
12.23 - LOG(x)
Returns the natural logarithm of x. If x is less than or equal to 0.0E0, the function returns NULL.
Analyze Syntax
Analyze Examples
func.log(2)
┌────────────────────┐
│ func.log(2) │
├────────────────────┤
│ 0.6931471805599453 │
└────────────────────┘
SQL Syntax
SQL Examples
SELECT LOG(2);
┌────────────────────┐
│ log(2) │
├────────────────────┤
│ 0.6931471805599453 │
└────────────────────┘
12.24 - LOG10
Returns the base-10 logarithm of x. If x is less than or equal to 0.0E0, the function returns NULL.
Analyze Syntax
Analyze Examples
func.log10(100)
┌─────────────────┐
│ func.log10(100) │
├─────────────────┤
│ 2 │
└─────────────────┘
SQL Syntax
SQL Examples
SELECT LOG10(100);
┌────────────┐
│ log10(100) │
├────────────┤
│ 2 │
└────────────┘
12.25 - LOG2
Returns the base-2 logarithm of x. If x is less than or equal to 0.0E0, the function returns NULL.
Analyze Syntax
Analyze Examples
func.log2(65536)
┌──────────────────┐
│ func.log2(65536) │
├──────────────────┤
│ 16 │
└──────────────────┘
SQL Syntax
SQL Examples
SELECT LOG2(65536);
┌─────────────┐
│ log2(65536) │
├─────────────┤
│ 16 │
└─────────────┘
12.26 - MINUS
Negates a numeric value.
Analyze Syntax
Analyze Examples
func.minus(func.pi())
┌─────────────────────────┐
│ func.minus(func.pi()) │
├─────────────────────────┤
│ -3.141592653589793 │
└─────────────────────────┘
SQL Syntax
Aliases
SQL Examples
SELECT MINUS(PI()), NEG(PI()), NEGATE(PI()), SUBTRACT(PI());
┌───────────────────────────────────────────────────────────────────────────────────┐
│ minus(pi()) │ neg(pi()) │ negate(pi()) │ subtract(pi()) │
├────────────────────┼────────────────────┼────────────────────┼────────────────────┤
│ -3.141592653589793 │ -3.141592653589793 │ -3.141592653589793 │ -3.141592653589793 │
└───────────────────────────────────────────────────────────────────────────────────┘
12.27 - MOD
Alias for MODULO.
12.28 - MODULO
Returns the remainder of x divided by y. If y is 0, it returns an error.
Analyze Syntax
Analyze Examples
func.modulo(9, 2)
┌───────────────────┐
│ func.modulo(9, 2) │
├───────────────────┤
│ 1 │
└───────────────────┘
SQL Syntax
Aliases
SQL Examples
SELECT MOD(9, 2), MODULO(9, 2);
┌──────────────────────────┐
│ mod(9, 2) │ modulo(9, 2) │
├───────────┼──────────────┤
│ 1 │ 1 │
└──────────────────────────┘
12.29 - NEG
Alias for MINUS.
12.30 - NEGATE
Alias for MINUS.
12.31 - PI
Returns the value of π as a floating-point value.
Analyze Syntax
Analyze Examples
func.pi()
┌───────────────────┐
│ func.pi() │
├───────────────────┤
│ 3.141592653589793 │
└───────────────────┘
SQL Syntax
SQL Examples
SELECT PI();
┌───────────────────┐
│ pi() │
├───────────────────┤
│ 3.141592653589793 │
└───────────────────┘
12.32 - PLUS
Calculates the sum of two numeric or decimal values.
Analyze Syntax
func.plus(<number1>, <number2>)
Analyze Examples
func.plus(1, 2.3)
┌────────────────────┐
│ func.plus(1, 2.3) │
├────────────────────┤
│ 3.3 │
└────────────────────┘
SQL Syntax
PLUS(<number1>, <number2>)
Aliases
SQL Examples
SELECT ADD(1, 2.3), PLUS(1, 2.3);
┌───────────────────────────────┐
│ add(1, 2.3) │ plus(1, 2.3) │
├───────────────┼───────────────┤
│ 3.3 │ 3.3 │
└───────────────────────────────┘
12.33 - POW
Returns the value of x to the power of y.
Analyze Syntax
Analyze Examples
func.pow(-2, 2)
┌────────────────────┐
│ func.pow((- 2), 2) │
├────────────────────┤
│ 4 │
└────────────────────┘
SQL Syntax
Aliases
SQL Examples
SELECT POW(-2, 2), POWER(-2, 2);
┌─────────────────────────────────┐
│ pow((- 2), 2) │ power((- 2), 2) │
├───────────────┼─────────────────┤
│ 4 │ 4 │
└─────────────────────────────────┘
12.34 - POWER
Alias for POW.
12.35 - RADIANS
Returns the argument x, converted from degrees to radians.
Analyze Syntax
Analyze Examples
func.radians(90)
┌────────────────────┐
│ func.radians(90) │
├────────────────────┤
│ 1.5707963267948966 │
└────────────────────┘
SQL Syntax
SQL Examples
SELECT RADIANS(90);
┌────────────────────┐
│ radians(90) │
├────────────────────┤
│ 1.5707963267948966 │
└────────────────────┘
12.36 - RAND()
Returns a random floating-point value v in the range 0 <= v < 1.0. To obtain a random integer R in the range i <= R < j, use the expression FLOOR(i + RAND() * (j − i)).
Analyze Syntax
Analyze Examples
func.rand()
┌────────────────────┐
│ func.rand() │
├────────────────────┤
│ 0.5191511074382174 │
└────────────────────┘
SQL Syntax
SQL Examples
SELECT RAND();
┌────────────────────┐
│ rand() │
├────────────────────┤
│ 0.5191511074382174 │
└────────────────────┘
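The integer-range expression FLOOR(i + RAND() * (j − i)) can be tried out in Python, where random.random() likewise returns a value v with 0 <= v < 1.0. The helper name and the fixed seed below are illustrative choices only, and Python's generator is not the one used by the platform.

```python
import math
import random

rng = random.Random(42)  # fixed seed only so the sketch is repeatable


def rand_int(i: int, j: int) -> int:
    """Scale a value in [0, 1) to [i, j) and round down to an integer."""
    return math.floor(i + rng.random() * (j - i))


samples = [rand_int(5, 10) for _ in range(1000)]
lowest, highest = min(samples), max(samples)
```

Every sample falls between 5 and 9 inclusive: the lower bound i can occur, while the upper bound j is excluded.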
12.37 - RAND(n)
Returns a random floating-point value v in the range 0 <= v < 1.0. To obtain a random integer R in the range i <= R < j, use the expression FLOOR(i + RAND() * (j − i)). Argument n is used as the seed value. For equal argument values, RAND(n) returns the same value each time, and thus produces a repeatable sequence of column values.
Analyze Syntax
Analyze Examples
func.rand(1)
┌────────────────────┐
│ func.rand(1) │
├────────────────────┤
│ 0.7133693869548766 │
└────────────────────┘
SQL Syntax
SQL Examples
SELECT RAND(1);
┌────────────────────┐
│ rand(1) │
├────────────────────┤
│ 0.7133693869548766 │
└────────────────────┘
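Seeding works the same way in most pseudo-random generators: the same seed always yields the same value. The Python sketch below demonstrates the idea with the standard random module. The helper name is hypothetical, and because the underlying generator differs, the numbers it produces will not match the RAND(n) output shown above.

```python
import random


def seeded_rand(seed: int) -> float:
    """Return the first value of a generator initialised with the seed."""
    return random.Random(seed).random()


first_call = seeded_rand(1)
second_call = seeded_rand(1)   # same seed, same value
other_seed = seeded_rand(2)    # a different seed gives a different value here
```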
12.38 - ROUND
Rounds the argument x to d decimal places. The rounding algorithm depends on the data type of x. d defaults to 0 if not specified. d can be negative to cause d digits left of the decimal point of the value x to become zero. The maximum absolute value for d is 30; any digits in excess of 30 (or -30) are truncated.
When using this function's result in calculations, be aware of potential precision issues due to its return data type being DOUBLE, which may affect final accuracy:
SELECT ROUND(4/7, 4) - ROUND(3/7, 4); -- Result: 0.14280000000000004
SELECT ROUND(4/7, 4)::DECIMAL(8,4) - ROUND(3/7, 4)::DECIMAL(8,4); -- Result: 0.1428
Analyze Syntax
Analyze Examples
func.round(0.123, 2)
┌──────────────────────┐
│ func.round(0.123, 2) │
├──────────────────────┤
│ 0.12 │
└──────────────────────┘
SQL Syntax
SQL Examples
SELECT ROUND(0.123, 2);
┌─────────────────┐
│ round(0.123, 2) │
├─────────────────┤
│ 0.12 │
└─────────────────┘
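The precision caveat described for ROUND is a general property of binary floating-point numbers rather than something specific to this function. The Python sketch below reproduces the same subtraction with floats and with exact decimals. It assumes Python's round and float types behave like the DOUBLE result described above, which is an analogy and not a statement about the platform's internals.

```python
from decimal import Decimal

# Floats: the rounded operands are stored in binary, so the difference
# may carry a tiny representation error.
float_diff = round(4 / 7, 4) - round(3 / 7, 4)

# Decimals: the same operands held exactly, like the DECIMAL(8,4) cast.
decimal_diff = Decimal("0.5714") - Decimal("0.4286")
```

The decimal subtraction yields exactly 0.1428, which is why casting to DECIMAL before further arithmetic is the safer choice when exact results matter.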
12.39 - SIGN
Returns the sign of the argument as -1, 0, or 1, depending on whether x is negative, zero, or positive, or NULL if the argument was NULL.
Analyze Syntax
Analyze Examples
func.sign(0)
┌──────────────┐
│ func.sign(0) │
├──────────────┤
│ 0 │
└──────────────┘
SQL Syntax
SQL Examples
SELECT SIGN(0);
┌─────────┐
│ sign(0) │
├─────────┤
│ 0 │
└─────────┘
12.40 - SIN
Returns the sine of x, where x is given in radians.
Analyze Syntax
Analyze Examples
func.sin(90)
┌────────────────────┐
│ func.sin(90) │
├────────────────────┤
│ 0.8939966636005579 │
└────────────────────┘
SQL Syntax
SQL Examples
SELECT SIN(90);
┌────────────────────┐
│ sin(90) │
├────────────────────┤
│ 0.8939966636005579 │
└────────────────────┘
12.41 - SQRT
Returns the square root of a nonnegative number x. Returns NaN for negative input.
Analyze Syntax
Analyze Examples
func.sqrt(4)
┌──────────────┐
│ func.sqrt(4) │
├──────────────┤
│ 2 │
└──────────────┘
SQL Syntax
SQL Examples
SELECT SQRT(4);
┌─────────┐
│ sqrt(4) │
├─────────┤
│ 2 │
└─────────┘
12.42 - SUBTRACT
Alias for MINUS.
12.43 - TAN
Returns the tangent of x, where x is given in radians.
Analyze Syntax
Analyze Examples
func.tan(90)
┌────────────────────┐
│ func.tan(90) │
├────────────────────┤
│ -1.995200412208242 │
└────────────────────┘
SQL Syntax
SQL Examples
SELECT TAN(90);
┌────────────────────┐
│ tan(90) │
├────────────────────┤
│ -1.995200412208242 │
└────────────────────┘
12.44 - TRUNCATE
Returns the number x, truncated to d decimal places. If d is 0, the result has no decimal point or fractional part. d can be negative to cause d digits left of the decimal point of the value x to become zero. The maximum absolute value for d is 30; any digits in excess of 30 (or -30) are truncated.
Analyze Syntax
Analyze Examples
func.truncate(1.223, 1)
┌─────────────────────────┐
│ func.truncate(1.223, 1) │
├─────────────────────────┤
│ 1.2 │
└─────────────────────────┘
SQL Syntax
SQL Examples
SELECT TRUNCATE(1.223, 1);
┌────────────────────┐
│ truncate(1.223, 1) │
├────────────────────┤
│ 1.2 │
└────────────────────┘
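The effect of positive, zero, and negative values of d can be sketched with Python's decimal module, which truncates toward zero when asked to round down. The helper below is a hypothetical illustration of the rule described above and does not model the 30-digit limit.

```python
from decimal import Decimal, ROUND_DOWN


def truncate(x: float, d: int) -> Decimal:
    """Drop digits beyond d decimal places without rounding."""
    step = Decimal(1).scaleb(-d)  # 10**-d, e.g. 0.1 for d=1, 100 for d=-2
    return Decimal(str(x)).quantize(step, rounding=ROUND_DOWN)


one_place = truncate(1.223, 1)         # keeps one decimal place
no_fraction = truncate(1.223, 0)       # drops the fractional part
hundreds = truncate(1234.567, -2)      # zeroes two digits left of the point
negative_value = truncate(-1.29, 1)    # truncation moves toward zero
```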
13 - Other Functions
Type Conversion Functions
Utility Functions
Others
13.1 - ASSUME_NOT_NULL
Results in an equivalent non-Nullable value for a Nullable type. If the original value is NULL, the result is undetermined.
Analyze Syntax
func.assume_not_null(<x>)
Analyze Examples
With a table like:
┌────────────────────┐
│ x │ y │
├────────────────────┤
│ 1 │ NULL │
│ 2 │ 3 │
└────────────────────┘
func.assume_not_null(y)
┌─────────────────────────┐
│ func.assume_not_null(y) │
├─────────────────────────┤
│ 0 │
│ 3 │
└─────────────────────────┘
SQL Syntax
Aliases
Return Type
Returns the original datatype for a non-Nullable type; returns the embedded non-Nullable datatype for a Nullable type.
SQL Examples
CREATE TABLE default.t_null ( x int, y int null);
INSERT INTO default.t_null values (1, null), (2, 3);
SELECT ASSUME_NOT_NULL(y), REMOVE_NULLABLE(y) FROM t_null;
┌─────────────────────────────────────────┐
│ assume_not_null(y) │ remove_nullable(y) │
├────────────────────┼────────────────────┤
│ 0 │ 0 │
│ 3 │ 3 │
└─────────────────────────────────────────┘
13.2 - EXISTS
The exists condition is used in combination with a subquery and is considered "to be met" if the subquery returns at least one row.
SQL Syntax
WHERE EXISTS ( <subquery> );
SQL Examples
SELECT number FROM numbers(5) AS A WHERE exists (SELECT * FROM numbers(3) WHERE number=1);
+--------+
| number |
+--------+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
+--------+
13.3 - GROUPING
Returns a bit mask indicating which GROUP BY expressions are not included in the current grouping set. Bits are assigned with the rightmost argument corresponding to the least-significant bit; each bit is 0 if the corresponding expression is included in the grouping criteria of the grouping set generating the current result row, and 1 if it is not included.
SQL Syntax
GROUPING ( expr [, expr, ...] )
Note: GROUPING can only be used with GROUPING SETS, ROLLUP, or CUBE, and its arguments must be in the grouping sets list.
Arguments
Grouping sets items.
Return Type
UInt32.
SQL Examples
select a, b, grouping(a), grouping(b), grouping(a,b), grouping(b,a) from t group by grouping sets ((a,b),(a),(b), ()) ;
+------+------+-------------+-------------+----------------+----------------+
| a | b | grouping(a) | grouping(b) | grouping(a, b) | grouping(b, a) |
+------+------+-------------+-------------+----------------+----------------+
| NULL | A | 1 | 0 | 2 | 1 |
| a | NULL | 0 | 1 | 1 | 2 |
| b | A | 0 | 0 | 0 | 0 |
| NULL | NULL | 1 | 1 | 3 | 3 |
| a | A | 0 | 0 | 0 | 0 |
| b | B | 0 | 0 | 0 | 0 |
| b | NULL | 0 | 1 | 1 | 2 |
| a | B | 0 | 0 | 0 | 0 |
| NULL | B | 1 | 0 | 2 | 1 |
+------+------+-------------+-------------+----------------+----------------+
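The bit assignment described for GROUPING can be checked with a small calculation. The following Python sketch is only an illustration of the rule, not the engine's implementation; the function name grouping and its parameters are chosen here for the example.

```python
def grouping(args, grouping_set):
    """Compute the GROUPING bit mask for one grouping set.

    args: the expressions passed to GROUPING, left to right.
    grouping_set: the expressions included in the current grouping set.
    The rightmost argument maps to the least-significant bit.
    """
    mask = 0
    for expr in args:
        # Shift earlier arguments left; set the bit when expr is NOT grouped.
        mask = (mask << 1) | (0 if expr in grouping_set else 1)
    return mask


# Rows produced by the grouping set (b): column a is aggregated away.
print(grouping(("a", "b"), {"b"}))
print(grouping(("b", "a"), {"b"}))
# Rows produced by the empty grouping set (): both columns are aggregated away.
print(grouping(("a", "b"), set()))
```

For the grouping set (b), grouping(a, b) is 2 and grouping(b, a) is 1, which matches the rows in the table above where a is NULL and b has a value.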
13.4 - HUMANIZE_NUMBER
Formats a number as a human-readable string, such as 1 million for 1000 * 1000.
Analyze Syntax
func.humanize_number(x)
Analyze Examples
func.humanize_number(1000 * 1000)
+-------------------------------------+
| func.humanize_number((1000 * 1000)) |
+-------------------------------------+
| 1 million |
+-------------------------------------+
SQL Syntax
HUMANIZE_NUMBER(x)
Arguments
Arguments | Description |
---|
x | The numerical value. |
Return Type
String.
SQL Examples
SELECT HUMANIZE_NUMBER(1000 * 1000)
+--------------------------------+
| HUMANIZE_NUMBER((1000 * 1000)) |
+--------------------------------+
| 1 million |
+--------------------------------+
13.5 - HUMANIZE_SIZE
Returns the readable size with a suffix (KiB, MiB, etc.).
Analyze Syntax
func.humanize_size(x)
Analyze Examples
func.humanize_size(1024 * 1024)
+-----------------------------------+
| func.humanize_size((1024 * 1024)) |
+-----------------------------------+
| 1 MiB |
+-----------------------------------+
SQL Syntax
HUMANIZE_SIZE(x)
Arguments
Arguments | Description |
---|
x | The numerical size. |
Return Type
String.
SQL Examples
SELECT HUMANIZE_SIZE(1024 * 1024)
+------------------------------+
| HUMANIZE_SIZE((1024 * 1024)) |
+------------------------------+
| 1 MiB |
+------------------------------+
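The binary suffixes (KiB, MiB, and so on) step by a factor of 1024. A minimal Python sketch of that conversion follows; the two-decimal rounding and the unit list are assumptions made for illustration, and the engine's exact formatting for non-round values may differ.

```python
def humanize_size(num_bytes):
    """Format a byte count with a binary suffix (illustrative sketch only)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = float(num_bytes)
    unit = 0
    # Divide by 1024 until the value fits the current unit.
    while value >= 1024 and unit < len(units) - 1:
        value /= 1024
        unit += 1
    # Assumed formatting: at most two decimals, trailing zeros dropped.
    text = f"{value:.2f}".rstrip("0").rstrip(".")
    return f"{text} {units[unit]}"


print(humanize_size(1024 * 1024))
```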
13.6 - IGNORE
The INSERT IGNORE statement skips rows with invalid data that would otherwise cause an error and inserts the rows with valid data into the table.
SQL Syntax
INSERT IGNORE INTO table_name (column_list)
VALUES (value_list),
(value_list),
...
13.7 - REMOVE_NULLABLE
Alias for ASSUME_NOT_NULL.
13.8 - TO_NULLABLE
Converts a value to its nullable equivalent.
If the value can already hold NULL, the function returns it unchanged. Otherwise, TO_NULLABLE wraps the value in a nullable container so that it can hold NULL.
Analyze Syntax
func.to_nullable(x)
Analyze Examples
func.typeof(3), func.to_nullable(3), func.typeof(func.to_nullable(3))
func.typeof(3) | func.to_nullable(3) | func.typeof(func.to_nullable(3)) |
-----------------+---------------------+----------------------------------+
TINYINT UNSIGNED | 3 | TINYINT UNSIGNED NULL |
SQL Syntax
TO_NULLABLE(x)
Arguments
Arguments | Description |
---|
x | The original value. |
Return Type
Returns a value of the same data type as the input value, but wrapped in a nullable container if the input value is not already nullable.
SQL Examples
SELECT typeof(3), TO_NULLABLE(3), typeof(TO_NULLABLE(3));
typeof(3) |to_nullable(3)|typeof(to_nullable(3))|
----------------+--------------+----------------------+
TINYINT UNSIGNED| 3|TINYINT UNSIGNED NULL |
13.9 - TYPEOF
Returns the name of the data type of an expression.
Analyze Syntax
func.typeof(<expr>)
Analyze Examples
func.typeof(1)
+------------------+
| func.typeof(1) |
+------------------+
| INT |
+------------------+
SQL Syntax
TYPEOF( <expr> )
Arguments
Arguments | Description |
---|
<expr> | Any expression. This may be a column name, the result of another function, or a math operation. |
Return Type
String
SQL Examples
SELECT typeof(1::INT);
+------------------+
| typeof(1::Int32) |
+------------------+
| INT |
+------------------+
14 - Semi-Structured Functions
This section provides reference information for the semi-structured data functions in PlaidCloud Lakehouse.
JSON Parsing, Conversion & Type Checking:
JSON Query and Extraction:
JSON Data Manipulation:
Object Operations:
Type Conversion:
14.1 - AS_<type>
Strictly casts VARIANT values to other data types.
If the input data type is not VARIANT, the output is NULL.
If the type of the value in the VARIANT does not match the output type, the output is NULL.
Analyze Syntax
func.as_boolean( <variant> )
func.as_integer( <variant> )
func.as_float( <variant> )
func.as_string( <variant> )
func.as_array( <variant> )
func.as_object( <variant> )
SQL Syntax
AS_BOOLEAN( <variant> )
AS_INTEGER( <variant> )
AS_FLOAT( <variant> )
AS_STRING( <variant> )
AS_ARRAY( <variant> )
AS_OBJECT( <variant> )
Arguments
Arguments | Description |
---|
<variant> | The VARIANT value |
Return Type
- AS_BOOLEAN: BOOLEAN
- AS_INTEGER: BIGINT
- AS_FLOAT: DOUBLE
- AS_STRING: VARCHAR
- AS_ARRAY: VARIANT that contains an ARRAY
- AS_OBJECT: VARIANT that contains an OBJECT
SQL Examples
SELECT as_boolean(parse_json('true'));
+--------------------------------+
| as_boolean(parse_json('true')) |
+--------------------------------+
| 1 |
+--------------------------------+
SELECT as_integer(parse_json('123'));
+-------------------------------+
| as_integer(parse_json('123')) |
+-------------------------------+
| 123 |
+-------------------------------+
SELECT as_float(parse_json('12.34'));
+-------------------------------+
| as_float(parse_json('12.34')) |
+-------------------------------+
| 12.34 |
+-------------------------------+
SELECT as_string(parse_json('"abc"'));
+--------------------------------+
| as_string(parse_json('"abc"')) |
+--------------------------------+
| abc |
+--------------------------------+
SELECT as_array(parse_json('[1,2,3]'));
+---------------------------------+
| as_array(parse_json('[1,2,3]')) |
+---------------------------------+
| [1,2,3] |
+---------------------------------+
SELECT as_object(parse_json('{"k":"v","a":"b"}'));
+--------------------------------------------+
| as_object(parse_json('{"k":"v","a":"b"}')) |
+--------------------------------------------+
| {"k":"v","a":"b"} |
+--------------------------------------------+
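The strict-casting rule (return the value only when its JSON type already matches, otherwise NULL) can be sketched in Python. Here json.loads stands in for parse_json and None stands in for NULL; the helper names mirror the SQL functions but are defined only for this illustration, and edge cases such as whether AS_FLOAT accepts an integer are assumptions.

```python
import json


def as_boolean(variant):
    # Only a JSON boolean passes; everything else becomes None (NULL).
    return variant if isinstance(variant, bool) else None


def as_integer(variant):
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(variant, int) and not isinstance(variant, bool):
        return variant
    return None


def as_float(variant):
    return variant if isinstance(variant, float) else None


def as_string(variant):
    return variant if isinstance(variant, str) else None


print(as_integer(json.loads("123")))       # type matches
print(as_integer(json.loads('"123"')))     # a JSON string is not converted
print(as_string(json.loads('"abc"')))
```

The second call shows the difference from a lenient conversion: the string "123" is not parsed into a number, it simply yields NULL.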
14.2 - CHECK_JSON
Checks the validity of a JSON document.
If the input string is a valid JSON document or NULL, the output is NULL.
If the input cannot be translated to a valid JSON value, the output string contains the error message.
Analyze Syntax
func.check_json(<expr>)
Analyze Example
func.check_json('[1,2,3]');
+----------------------------+
| func.check_json('[1,2,3]') |
+----------------------------+
| NULL |
+----------------------------+
SQL Syntax
CHECK_JSON( <expr> )
Arguments
Arguments | Description |
---|
<expr> | An expression of string type |
Return Type
String
SQL Examples
SELECT check_json('[1,2,3]');
+-----------------------+
| check_json('[1,2,3]') |
+-----------------------+
| NULL |
+-----------------------+
SELECT check_json('{"key":"val"}');
+-----------------------------+
| check_json('{"key":"val"}') |
+-----------------------------+
| NULL |
+-----------------------------+
SELECT check_json('{"key":');
+----------------------------------------------+
| check_json('{"key":') |
+----------------------------------------------+
| EOF while parsing a value at line 1 column 7 |
+----------------------------------------------+
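The contract of CHECK_JSON (NULL for valid or NULL input, an error message otherwise) can be mirrored with Python's json module. This is a sketch of the contract only: the error text comes from Python's parser and will not match the engine's messages shown above.

```python
import json


def check_json(text):
    """Return None for valid JSON or None input, else an error message."""
    if text is None:
        return None
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # The message wording is Python's, not the Lakehouse engine's.
        return str(err)
    return None


print(check_json("[1,2,3]"))
print(check_json('{"key":'))
```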
14.3 - FLATTEN
Transforms nested JSON data into a tabular format, where each element or field is represented as a separate row.
SQL Syntax
[LATERAL] FLATTEN ( INPUT => <expr> [ , PATH => <expr> ]
[ , OUTER => TRUE | FALSE ]
[ , RECURSIVE => TRUE | FALSE ]
[ , MODE => 'OBJECT' | 'ARRAY' | 'BOTH' ] )
Parameter / Keyword | Description | Default |
---|
INPUT | Specifies the JSON or array data to flatten. | - |
PATH | Specifies the path to the array or object within the input data to flatten. | - |
OUTER | If set to TRUE, rows with zero results will still be included in the output, but the values in the KEY, INDEX, and VALUE columns of those rows will be set to NULL. | FALSE |
RECURSIVE | If set to TRUE, the function will continue to flatten nested elements. | FALSE |
MODE | Controls whether to flatten only objects ('OBJECT'), only arrays ('ARRAY'), or both ('BOTH'). | 'BOTH' |
LATERAL | LATERAL is an optional keyword used to reference columns defined to the left of the LATERAL keyword within the FROM clause. LATERAL enables cross-referencing between the preceding table expressions and the function. | - |
Output
The following table describes the output columns of the FLATTEN function:
Note: When using the LATERAL keyword with FLATTEN, these output columns may not be explicitly provided, as LATERAL introduces dynamic cross-referencing, altering the output structure.
Column | Description |
---|
SEQ | A unique sequence number associated with the input. |
KEY | Key to the expanded value. If the flattened element does not contain a key, it's set to NULL. |
PATH | Path to the flattened element. |
INDEX | If the element is an array, this column contains its index; otherwise, it's set to NULL. |
VALUE | Value of the flattened element. |
THIS | This column identifies the element currently being flattened. |
SQL Examples
SQL Examples 1: Demonstrating PATH, OUTER, RECURSIVE, and MODE Parameters
This example demonstrates the behavior of the FLATTEN function with respect to the PATH, OUTER, RECURSIVE, and MODE parameters.
SELECT
*
FROM
FLATTEN (
INPUT => PARSE_JSON (
'{"name": "John", "languages": ["English", "Spanish", "French"], "address": {"city": "New York", "state": "NY"}}'
)
);
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ seq │ key │ path │ index │ value │ this │
├────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 1 │ address │ address │ NULL │ {"city":"New York","state":"NY"} │ {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} │
│ 1 │ languages │ languages │ NULL │ ["English","Spanish","French"] │ {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} │
│ 1 │ name │ name │ NULL │ "John" │ {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-- PATH helps in selecting elements at a specific path from the original JSON data.
SELECT
*
FROM
FLATTEN (
INPUT => PARSE_JSON (
'{"name": "John", "languages": ["English", "Spanish", "French"], "address": {"city": "New York", "state": "NY"}}'
),
PATH => 'languages'
);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ seq │ key │ path │ index │ value │ this │
├────────┼──────────────────┼──────────────────┼──────────────────┼───────────────────┼────────────────────────────────┤
│ 1 │ NULL │ languages[0] │ 0 │ "English" │ ["English","Spanish","French"] │
│ 1 │ NULL │ languages[1] │ 1 │ "Spanish" │ ["English","Spanish","French"] │
│ 1 │ NULL │ languages[2] │ 2 │ "French" │ ["English","Spanish","French"] │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-- RECURSIVE enables recursive flattening of nested structures.
SELECT
*
FROM
FLATTEN (
INPUT => PARSE_JSON (
'{"name": "John", "languages": ["English", "Spanish", "French"], "address": {"city": "New York", "state": "NY"}}'
),
RECURSIVE => TRUE
);
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ seq │ key │ path │ index │ value │ this │
├────────┼──────────────────┼──────────────────┼──────────────────┼──────────────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 1 │ address │ address │ NULL │ {"city":"New York","state":"NY"} │ {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} │
│ 1 │ city │ address.city │ NULL │ "New York" │ {"city":"New York","state":"NY"} │
│ 1 │ state │ address.state │ NULL │ "NY" │ {"city":"New York","state":"NY"} │
│ 1 │ languages │ languages │ NULL │ ["English","Spanish","French"] │ {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} │
│ 1 │ NULL │ languages[0] │ 0 │ "English" │ ["English","Spanish","French"] │
│ 1 │ NULL │ languages[1] │ 1 │ "Spanish" │ ["English","Spanish","French"] │
│ 1 │ NULL │ languages[2] │ 2 │ "French" │ ["English","Spanish","French"] │
│ 1 │ name │ name │ NULL │ "John" │ {"address":{"city":"New York","state":"NY"},"languages":["English","Spanish","French"],"name":"John"} │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-- MODE specifies whether only objects ('OBJECT'), only arrays ('ARRAY'), or both ('BOTH') should be flattened.
-- In this example, MODE => 'ARRAY' is used, which means that only arrays within the JSON data will be flattened.
SELECT
*
FROM
FLATTEN (
INPUT => PARSE_JSON (
'{"name": "John", "languages": ["English", "Spanish", "French"], "address": {"city": "New York", "state": "NY"}}'
),
MODE => 'ARRAY'
);
---
-- OUTER determines the inclusion of zero-row expansions in the output.
-- In this first example, OUTER => TRUE is used with an empty JSON array, which results in zero-row expansions.
-- Rows are included in the output even when there are no values to flatten.
SELECT
*
FROM
FLATTEN (INPUT => PARSE_JSON ('[]'), OUTER => TRUE);
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ seq │ key │ path │ index │ value │ this │
├────────┼──────────────────┼──────────────────┼──────────────────┼───────────────────┼───────────────────┤
│ 1 │ NULL │ NULL │ NULL │ NULL │ NULL │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-- In this second example, OUTER is omitted, and the output shows how rows with zero results are not included when OUTER is not specified.
SELECT
*
FROM
FLATTEN (INPUT => PARSE_JSON ('[]'));
SQL Examples 2: Demonstrating LATERAL FLATTEN
This example demonstrates the behavior of the FLATTEN function when used in conjunction with the LATERAL keyword.
-- Create a table for Tim Hortons transactions with multiple items
CREATE TABLE tim_hortons_transactions (
transaction_id INT,
customer_id INT,
items VARIANT
);
-- Insert data for Tim Hortons transactions with multiple items
INSERT INTO tim_hortons_transactions (transaction_id, customer_id, items)
VALUES
(101, 1, parse_json('[{"item":"coffee", "price":2.50}, {"item":"donut", "price":1.20}]')),
(102, 2, parse_json('[{"item":"bagel", "price":1.80}, {"item":"muffin", "price":2.00}]')),
(103, 3, parse_json('[{"item":"timbit_assortment", "price":5.00}]'));
-- Show Tim Hortons transactions with multiple items using LATERAL FLATTEN
SELECT
t.transaction_id,
t.customer_id,
f.value:item::STRING AS purchased_item,
f.value:price::FLOAT AS price
FROM
tim_hortons_transactions t,
LATERAL FLATTEN(input => t.items) f;
┌───────────────────────────────────────────────────────────────────────────┐
│ transaction_id │ customer_id │ purchased_item │ price │
├─────────────────┼─────────────────┼───────────────────┼───────────────────┤
│ 101 │ 1 │ coffee │ 2.5 │
│ 101 │ 1 │ donut │ 1.2 │
│ 102 │ 2 │ bagel │ 1.8 │
│ 102 │ 2 │ muffin │ 2 │
│ 103 │ 3 │ timbit_assortment │ 5 │
└───────────────────────────────────────────────────────────────────────────┘
-- Find maximum, minimum, and average prices of the purchased items
SELECT
MAX(f.value:price::FLOAT) AS max_price,
MIN(f.value:price::FLOAT) AS min_price,
AVG(f.value:price::FLOAT) AS avg_price
FROM
tim_hortons_transactions t,
LATERAL FLATTEN(input => t.items) f;
┌───────────────────────────────────────────────────────────┐
│ max_price │ min_price │ avg_price │
├───────────────────┼───────────────────┼───────────────────┤
│ 5 │ 1.2 │ 2.5 │
└───────────────────────────────────────────────────────────┘
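The KEY, PATH, INDEX, and VALUE columns shown in the first set of examples can be reproduced with a short recursive walk. The Python sketch below is a simplified model under stated assumptions: it omits SEQ, THIS, OUTER, and MODE, and it orders object keys alphabetically because that is the order the example output shows. The function name flatten and its parameters are illustrative.

```python
import json


def flatten(value, path="", recursive=False):
    """Yield one row per child element of a JSON object or array."""
    rows = []
    if isinstance(value, dict):
        # Object fields: KEY is set, INDEX is None (NULL).
        children = [(k, None, f"{path}.{k}" if path else k, v)
                    for k, v in sorted(value.items())]
    elif isinstance(value, list):
        # Array elements: INDEX is set, KEY is None (NULL).
        children = [(None, i, f"{path}[{i}]", v)
                    for i, v in enumerate(value)]
    else:
        return rows  # Scalars have nothing to flatten.
    for key, index, child_path, child in children:
        rows.append({"key": key, "path": child_path,
                     "index": index, "value": child})
        if recursive:
            rows.extend(flatten(child, child_path, recursive=True))
    return rows


doc = json.loads('{"name": "John", "languages": ["English", "Spanish", "French"], '
                 '"address": {"city": "New York", "state": "NY"}}')

top_level_paths = [row["path"] for row in flatten(doc)]
recursive_paths = [row["path"] for row in flatten(doc, recursive=True)]
# Equivalent of PATH => 'languages': start the walk at that element.
language_rows = flatten(doc["languages"], path="languages")

print(top_level_paths)
print(recursive_paths)
```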
14.4 - GET
Extracts a value from a VARIANT that contains an ARRAY by index, or from a VARIANT that contains an OBJECT by field_name.
The value is returned as a VARIANT, or NULL if either of the arguments is NULL.
GET applies case-sensitive matching to field_name. For case-insensitive matching, use GET_IGNORE_CASE.
Analyze Syntax
func.get(<variant>, <index>)
or
func.get(<variant>, <field_name>)
Analyze Example
func.get(func.parse_json('[2.71, 3.14]'), 0);
+----------------------------------------------+
| func.get(func.parse_json('[2.71, 3.14]'), 0) |
+----------------------------------------------+
| 2.71 |
+----------------------------------------------+
func.get(func.parse_json('{"aa":1, "aA":2, "Aa":3}'), 'aa');
+-------------------------------------------------------------+
| func.get(func.parse_json('{"aa":1, "aA":2, "Aa":3}'), 'aa') |
+-------------------------------------------------------------+
| 1 |
+-------------------------------------------------------------+
SQL Syntax
GET( <variant>, <index> )
GET( <variant>, <field_name> )
Arguments
Arguments | Description |
---|
<variant> | The VARIANT value that contains either an ARRAY or an OBJECT |
<index> | The Uint32 value specifies the position of the value in ARRAY |
<field_name> | The String value specifies the key in a key-value pair of OBJECT |
Return Type
VARIANT
SQL Examples
SELECT get(parse_json('[2.71, 3.14]'), 0);
+------------------------------------+
| get(parse_json('[2.71, 3.14]'), 0) |
+------------------------------------+
| 2.71 |
+------------------------------------+
SELECT get(parse_json('{"aa":1, "aA":2, "Aa":3}'), 'aa');
+---------------------------------------------------+
| get(parse_json('{"aa":1, "aA":2, "Aa":3}'), 'aa') |
+---------------------------------------------------+
| 1 |
+---------------------------------------------------+
SELECT get(parse_json('{"aa":1, "aA":2, "Aa":3}'), 'AA');
+---------------------------------------------------+
| get(parse_json('{"aa":1, "aA":2, "Aa":3}'), 'AA') |
+---------------------------------------------------+
| NULL |
+---------------------------------------------------+
14.5 - GET_IGNORE_CASE
Extracts a value from a VARIANT that contains an OBJECT by field_name.
The value is returned as a VARIANT, or NULL if either of the arguments is NULL.
GET_IGNORE_CASE is similar to GET but applies case-insensitive matching to field names. It first looks for an exact match of the field name; if none is found, it matches field names case-insensitively and takes the first match in alphabetical order.
Analyze Syntax
func.get_ignore_case(<variant>, <field_name>)
Analyze Example
func.get_ignore_case(func.parse_json('{"aa":1, "aA":2, "Aa":3}'), 'AA')
+-------------------------------------------------------------------------+
| func.get_ignore_case(func.parse_json('{"aa":1, "aA":2, "Aa":3}'), 'AA') |
+-------------------------------------------------------------------------+
| 3 |
+-------------------------------------------------------------------------+
SQL Syntax
GET_IGNORE_CASE( <variant>, <field_name> )
Arguments
Arguments | Description |
---|
<variant> | The VARIANT value that contains an OBJECT |
<field_name> | The String value specifies the key in a key-value pair of OBJECT |
Return Type
VARIANT
SQL Examples
SELECT get_ignore_case(parse_json('{"aa":1, "aA":2, "Aa":3}'), 'AA');
+---------------------------------------------------------------+
| get_ignore_case(parse_json('{"aa":1, "aA":2, "Aa":3}'), 'AA') |
+---------------------------------------------------------------+
| 3 |
+---------------------------------------------------------------+
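The two-step lookup (exact match first, then the alphabetically first case-insensitive match) explains why 'AA' resolves to 3 in the example above. The Python sketch below models that rule; treating "alphabetically" as code-point order is an assumption, chosen because it agrees with the documented result.

```python
def get_ignore_case(obj, field_name):
    """Look up field_name in a dict, falling back to case-insensitive match."""
    if not isinstance(obj, dict) or field_name is None:
        return None
    # Step 1: an exact match wins.
    if field_name in obj:
        return obj[field_name]
    # Step 2: among case-insensitive matches, take the first in sorted order.
    # Assumption: sorted order means code-point order ("Aa" < "aA" < "aa").
    matches = sorted(k for k in obj if k.lower() == field_name.lower())
    return obj[matches[0]] if matches else None


data = {"aa": 1, "aA": 2, "Aa": 3}
print(get_ignore_case(data, "AA"))  # no exact match, falls back to "Aa"
print(get_ignore_case(data, "aa"))  # exact match
```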
14.6 - GET_PATH
Extracts a value from a VARIANT by path_name.
The value is returned as a VARIANT, or NULL if either of the arguments is NULL.
GET_PATH is equivalent to a chain of GET functions. path_name consists of a concatenation of field names preceded by periods (.), colons (:), or index operators ([index]). The first field name does not need a leading separator.
Analyze Syntax
func.get_path(<variant>, <path_name>)
Analyze Example
func.get_path(func.parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2.k4')
+---------------------------------------------------------------------------------+
| func.get_path(func.parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2.k4') |
+---------------------------------------------------------------------------------+
| 4 |
+---------------------------------------------------------------------------------+
SQL Syntax
GET_PATH( <variant>, <path_name> )
Arguments
Arguments | Description |
---|
<variant> | The VARIANT value that contains either an ARRAY or an OBJECT |
<path_name> | The String value that consists of a concatenation of field names |
Return Type
VARIANT
SQL Examples
SELECT get_path(parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k1[0]');
+-----------------------------------------------------------------------+
| get_path(parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k1[0]') |
+-----------------------------------------------------------------------+
| 0 |
+-----------------------------------------------------------------------+
SELECT get_path(parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2:k3');
+-----------------------------------------------------------------------+
| get_path(parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2:k3') |
+-----------------------------------------------------------------------+
| 3 |
+-----------------------------------------------------------------------+
SELECT get_path(parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2.k4');
+-----------------------------------------------------------------------+
| get_path(parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2.k4') |
+-----------------------------------------------------------------------+
| 4 |
+-----------------------------------------------------------------------+
SELECT get_path(parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2.k5');
+-----------------------------------------------------------------------+
| get_path(parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2.k5') |
+-----------------------------------------------------------------------+
| NULL |
+-----------------------------------------------------------------------+
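Because GET_PATH behaves like a chain of GET calls, a path can be split into field names and indexes and applied one step at a time. The Python sketch below illustrates that idea with a regular expression; it is a simplified model (no quoted field names, no escaping), and the helper name get_path is defined only for this example.

```python
import json
import re

# A path token is either an index like [0] or a field name
# delimited by '.', ':' or brackets.
_TOKEN = re.compile(r"\[(\d+)\]|([^.:\[\]]+)")


def get_path(variant, path_name):
    """Walk path_name through nested dicts and lists; None models NULL."""
    if variant is None or path_name is None:
        return None
    current = variant
    for index, field in _TOKEN.findall(path_name):
        if index:
            # Index step: equivalent to GET(<variant>, <index>).
            if not isinstance(current, list) or int(index) >= len(current):
                return None
            current = current[int(index)]
        else:
            # Field step: equivalent to GET(<variant>, <field_name>).
            if not isinstance(current, dict) or field not in current:
                return None
            current = current[field]
    return current


doc = json.loads('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}')
print(get_path(doc, "k1[0]"))
print(get_path(doc, "k2:k3"))
print(get_path(doc, "k2.k4"))
print(get_path(doc, "k2.k5"))
```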
14.7 - IS_ARRAY
Checks if the input value is a JSON array. Please note that a JSON array is not the same as the ARRAY data type. A JSON array is a data structure commonly used in JSON, representing an ordered collection of values enclosed within square brackets [ ]. It is a flexible format for organizing and exchanging various data types, including strings, numbers, booleans, objects, and nulls.
[
"Apple",
42,
true,
{"name": "John", "age": 30, "isStudent": false},
[1, 2, 3],
null
]
Analyze Syntax
func.is_array(<expr>)
Analyze Example
func.is_array(func.parse_json('true')), func.is_array(func.parse_json('[1,2,3]'))
┌────────────────────────────────────────────────────────────────────────────────────┐
│ func.is_array(func.parse_json('true')) │ func.is_array(func.parse_json('[1,2,3]')) │
├────────────────────────────────────────┼───────────────────────────────────────────┤
│ false │ true │
└────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
IS_ARRAY( <expr> )
Return Type
Returns true if the input value is a JSON array, and false otherwise.
SQL Examples
SELECT
IS_ARRAY(PARSE_JSON('true')),
IS_ARRAY(PARSE_JSON('[1,2,3]'));
┌────────────────────────────────────────────────────────────────┐
│ is_array(parse_json('true')) │ is_array(parse_json('[1,2,3]')) │
├──────────────────────────────┼─────────────────────────────────┤
│ false │ true │
└────────────────────────────────────────────────────────────────┘
14.8 - IS_BOOLEAN
Checks if the input JSON value is a boolean.
Analyze Syntax
func.is_boolean(<expr>)
Analyze Example
func.is_boolean(func.parse_json('true')), func.is_boolean(func.parse_json('[1,2,3]'))
┌────────────────────────────────────────────────────────────────────────────────────────┐
│ func.is_boolean(func.parse_json('true')) │ func.is_boolean(func.parse_json('[1,2,3]')) │
├──────────────────────────────────────────┼─────────────────────────────────────────────┤
│ true │ false │
└────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
IS_BOOLEAN( <expr> )
Return Type
Returns true if the input JSON value is a boolean, and false otherwise.
SQL Examples
SELECT
IS_BOOLEAN(PARSE_JSON('true')),
IS_BOOLEAN(PARSE_JSON('[1,2,3]'));
┌────────────────────────────────────────────────────────────────────┐
│ is_boolean(parse_json('true')) │ is_boolean(parse_json('[1,2,3]')) │
├────────────────────────────────┼───────────────────────────────────┤
│ true │ false │
└────────────────────────────────────────────────────────────────────┘
14.9 - IS_FLOAT
Checks if the input JSON value is a float.
Analyze Syntax
func.is_float(<expr>)
Analyze Example
func.is_float(func.parse_json('1.23')), func.is_float(func.parse_json('[1,2,3]'))
┌────────────────────────────────────────────────────────────────────────────────────────┐
│ func.is_float(func.parse_json('1.23')) │ func.is_float(func.parse_json('[1,2,3]')) │
├──────────────────────────────────────────┼─────────────────────────────────────────────┤
│ true │ false │
└────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
IS_FLOAT( <expr> )
Return Type
Returns true if the input JSON value is a float, and false otherwise.
SQL Examples
SELECT
IS_FLOAT(PARSE_JSON('1.23')),
IS_FLOAT(PARSE_JSON('[1,2,3]'));
┌────────────────────────────────────────────────────────────────┐
│ is_float(parse_json('1.23')) │ is_float(parse_json('[1,2,3]')) │
├──────────────────────────────┼─────────────────────────────────┤
│ true │ false │
└────────────────────────────────────────────────────────────────┘
14.10 - IS_INTEGER
Checks if the input JSON value is an integer.
Analyze Syntax
func.is_integer(<expr>)
Analyze Example
func.is_integer(func.parse_json('123')), func.is_integer(func.parse_json('[1,2,3]'))
┌────────────────────────────────────────────────────────────────────────────────────────┐
│ func.is_integer(func.parse_json('123')) │ func.is_integer(func.parse_json('[1,2,3]')) │
├──────────────────────────────────────────┼─────────────────────────────────────────────┤
│ true │ false │
└────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
IS_INTEGER( <expr> )
Return Type
Returns true if the input JSON value is an integer, and false otherwise.
SQL Examples
SELECT
IS_INTEGER(PARSE_JSON('123')),
IS_INTEGER(PARSE_JSON('[1,2,3]'));
┌───────────────────────────────────────────────────────────────────┐
│ is_integer(parse_json('123')) │ is_integer(parse_json('[1,2,3]')) │
├───────────────────────────────┼───────────────────────────────────┤
│ true │ false │
└───────────────────────────────────────────────────────────────────┘
14.11 - IS_NULL_VALUE
Checks whether the input value is a JSON null. Please note that this function examines JSON null, not SQL NULL. To check if a value is SQL NULL, use IS_NULL.
{
"name": "John",
"age": null
}
Analyze Syntax
func.is_null_value(<expr>)
Analyze Example
func.is_null_value(func.get_path(func.parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2.k5'))
┌─────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.is_null_value(func.get_path(func.parse_json('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'), 'k2.k5')) │
├─────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ true │
└─────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
IS_NULL_VALUE( <expr> )
Return Type
Returns true if the input value is a JSON null, and false otherwise.
SQL Examples
SELECT
IS_NULL_VALUE(PARSE_JSON('{"name":"John", "age":null}') :age), --JSON null
IS_NULL(NULL); --SQL NULL
┌──────────────────────────────────────────────────────────────────────────────┐
│ is_null_value(parse_json('{"name":"john", "age":null}'):age) │ is_null(null) │
├──────────────────────────────────────────────────────────────┼───────────────┤
│ true │ true │
└──────────────────────────────────────────────────────────────────────────────┘
14.12 - IS_OBJECT
Checks if the input value is a JSON object.
Analyze Syntax
func.is_object(<expr>)
Analyze Example
func.is_object(func.parse_json('{"a":"b"}')), func.is_object(func.parse_json('["a","b","c"]'))
┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.is_object(func.parse_json('{"a":"b"}')) │ func.is_object(func.parse_json('["a","b","c"]')) │
├───────────────────────────────────────────────┼──────────────────────────────────────────────────┤
│ true │ false │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
IS_OBJECT( <expr> )
Return Type
Returns true if the input JSON value is a JSON object, and false otherwise.
SQL Examples
SELECT
IS_OBJECT(PARSE_JSON('{"a":"b"}')), -- JSON Object
IS_OBJECT(PARSE_JSON('["a","b","c"]')); --JSON Array
┌─────────────────────────────────────────────────────────────────────────────┐
│ is_object(parse_json('{"a":"b"}')) │ is_object(parse_json('["a","b","c"]')) │
├────────────────────────────────────┼────────────────────────────────────────┤
│ true │ false │
└─────────────────────────────────────────────────────────────────────────────┘
14.13 - IS_STRING
Checks if the input JSON value is a string.
Analyze Syntax
func.is_string(<expr>)
Analyze Example
func.is_string(func.parse_json('"abc"')), func.is_string(func.parse_json('123'))
┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.is_string(func.parse_json('"abc"')) │ func.is_string(func.parse_json('123')) │
├───────────────────────────────────────────────┼──────────────────────────────────────────────────┤
│ true │ false │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
IS_STRING( <expr> )
Return Type
Returns true if the input JSON value is a string, and false otherwise.
SQL Examples
SELECT
IS_STRING(PARSE_JSON('"abc"')),
IS_STRING(PARSE_JSON('123'));
┌───────────────────────────────────────────────────────────────┐
│ is_string(parse_json('"abc"')) │ is_string(parse_json('123')) │
├────────────────────────────────┼──────────────────────────────┤
│ true │ false │
└───────────────────────────────────────────────────────────────┘
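The IS_ARRAY, IS_BOOLEAN, IS_FLOAT, IS_INTEGER, IS_OBJECT, and IS_STRING checks each test for one JSON type. When validating data outside the database, the same classification can be approximated in Python after parsing with json.loads. The helper json_kind below is hypothetical and only illustrates the distinction between the types; it is not part of the platform.

```python
import json


def json_kind(value):
    """Name the JSON type of a value parsed by json.loads."""
    if value is None:
        return "null"
    # Check bool before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


for text in ["true", "123", "1.23", '"abc"', "[1,2,3]", '{"a":"b"}']:
    print(text, "->", json_kind(json.loads(text)))
```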
14.14 - JSON_ARRAY
Creates a JSON array with specified values.
Analyze Syntax
func.json_array(value1[, value2[, ...]])
Analyze Example
func.json_array('fruits', func.json_array('apple', 'banana', 'orange'), func.json_object('price', 1.2, 'quantity', 3)) |
-----------------------------------------------------------------------------------------------------------------------+
["fruits",["apple","banana","orange"],{"price":1.2,"quantity":3}] |
SQL Syntax
JSON_ARRAY(value1[, value2[, ...]])
Return Type
JSON array.
SQL Examples
SQL Examples 1: Creating JSON Array with Constant Values or Expressions
SELECT JSON_ARRAY('PlaidCloud Lakehouse', 3.14, NOW(), TRUE, NULL);
json_array('PlaidCloud Lakehouse', 3.14, now(), true, null) |
--------------------------------------------------------------------+
["PlaidCloud Lakehouse",3.14,"2023-09-06 07:23:55.399070",true,null]|
SELECT JSON_ARRAY('fruits', JSON_ARRAY('apple', 'banana', 'orange'), JSON_OBJECT('price', 1.2, 'quantity', 3));
json_array('fruits', json_array('apple', 'banana', 'orange'), json_object('price', 1.2, 'quantity', 3))|
-------------------------------------------------------------------------------------------------------+
["fruits",["apple","banana","orange"],{"price":1.2,"quantity":3}] |
SQL Examples 2: Creating JSON Array from Table Data
CREATE TABLE products (
ProductName VARCHAR(255),
Price DECIMAL(10, 2)
);
INSERT INTO products (ProductName, Price)
VALUES
('Apple', 1.2),
('Banana', 0.5),
('Orange', 0.8);
SELECT JSON_ARRAY(ProductName, Price) FROM products;
json_array(productname, price)|
------------------------------+
["Apple",1.2] |
["Banana",0.5] |
["Orange",0.8] |
14.15 - JSON_ARRAY_ELEMENTS
Extracts the elements from a JSON array, returning them as individual rows in the result set. JSON_ARRAY_ELEMENTS does not recursively expand nested arrays; it treats them as single elements.
Analyze Syntax
func.json_array_elements(<json_string>)
Analyze Example
func.json_array_elements(func.parse_json('[ \n {"product": "Laptop", "brand": "Apple", "price": 1500},\n {"product": "Smartphone", "brand": "Samsung", "price": 800},\n {"product": "Headphones", "brand": "Sony", "price": 150}\n]'))
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.json_array_elements(func.parse_json('[ \n {"product": "laptop", "brand": "apple", "price": 1500},\n {"product": "smartphone", "brand": "samsung", "price": 800},\n {"product": "headphones", "brand": "sony", "price": 150}\n]')) │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ {"brand":"Apple","price":1500,"product":"Laptop"} │
│ {"brand":"Samsung","price":800,"product":"Smartphone"} │
│ {"brand":"Sony","price":150,"product":"Headphones"} │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
JSON_ARRAY_ELEMENTS(<json_string>)
Return Type
JSON_ARRAY_ELEMENTS returns a set of VARIANT values, each representing an element extracted from the input JSON array.
SQL Examples
-- Extract individual elements from a JSON array containing product information
SELECT
JSON_ARRAY_ELEMENTS(
PARSE_JSON (
'[
{"product": "Laptop", "brand": "Apple", "price": 1500},
{"product": "Smartphone", "brand": "Samsung", "price": 800},
{"product": "Headphones", "brand": "Sony", "price": 150}
]'
)
);
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ json_array_elements(parse_json('[ \n {"product": "laptop", "brand": "apple", "price": 1500},\n {"product": "smartphone", "brand": "samsung", "price": 800},\n {"product": "headphones", "brand": "sony", "price": 150}\n]')) │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ {"brand":"Apple","price":1500,"product":"Laptop"} │
│ {"brand":"Samsung","price":800,"product":"Smartphone"} │
│ {"brand":"Sony","price":150,"product":"Headphones"} │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-- Display data types of the extracted elements
SELECT
TYPEOF (
JSON_ARRAY_ELEMENTS(
PARSE_JSON (
'[
{"product": "Laptop", "brand": "Apple", "price": 1500},
{"product": "Smartphone", "brand": "Samsung", "price": 800},
{"product": "Headphones", "brand": "Sony", "price": 150}
]'
)
)
);
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ typeof(json_array_elements(parse_json('[ \n {"product": "laptop", "brand": "apple", "price": 1500},\n {"product": "smartphone", "brand": "samsung", "price": 800},\n {"product": "headphones", "brand": "sony", "price": 150}\n]'))) │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ VARIANT NULL │
│ VARIANT NULL │
│ VARIANT NULL │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
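The non-recursive behavior described for JSON_ARRAY_ELEMENTS can be sketched in plain Python. This is only an illustration built on the standard json module, not the Lakehouse implementation; the helper name json_array_elements_sketch and the choice to return no rows for non-array input are assumptions.

```python
import json


def json_array_elements_sketch(json_text):
    """Return the top-level elements of a JSON array, one per 'row'.

    Hypothetical helper: nested arrays are returned whole, not flattened.
    """
    value = json.loads(json_text)
    if not isinstance(value, list):
        # Assumption: non-array input produces no rows.
        return []
    return list(value)


# The nested array [2, 3] stays a single element.
rows = json_array_elements_sketch('[1, [2, 3], {"a": 4}]')
for row in rows:
    print(json.dumps(row, separators=(",", ":")))
```

The input has three top-level elements, so three rows come back, and the inner array is not expanded further.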
14.16 - JSON_EACH
Extracts key-value pairs from a JSON object, breaking down the structure into individual rows in the result set. Each row represents a distinct key-value pair derived from the input JSON expression.
Analyze Syntax
func.json_each(<json_string>)
Analyze Example
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.json_each(func.parse_json('{"name": "john", "age": 25, "isstudent": false, "grades": [90, 85, 92]}')) │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ ('age','25') │
│ ('grades','[90,85,92]') │
│ ('isStudent','false') │
│ ('name','"John"') │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
JSON_EACH(<json_string>)
Return Type
JSON_EACH returns a set of tuples, each consisting of a STRING key and a corresponding VARIANT value.
SQL Examples
-- Extract key-value pairs from a JSON object representing information about a person
SELECT
JSON_EACH(
PARSE_JSON (
'{"name": "John", "age": 25, "isStudent": false, "grades": [90, 85, 92]}'
)
);
┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ json_each(parse_json('{"name": "john", "age": 25, "isstudent": false, "grades": [90, 85, 92]}')) │
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ ('age','25') │
│ ('grades','[90,85,92]') │
│ ('isStudent','false') │
│ ('name','"John"') │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
-- Display data types of the extracted values
SELECT
TYPEOF (
JSON_EACH(
PARSE_JSON (
'{"name": "John", "age": 25, "isStudent": false, "grades": [90, 85, 92]}'
)
)
);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ typeof(json_each(parse_json('{"name": "john", "age": 25, "isstudent": false, "grades": [90, 85, 92]}'))) │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ TUPLE(STRING, VARIANT) NULL │
│ TUPLE(STRING, VARIANT) NULL │
│ TUPLE(STRING, VARIANT) NULL │
│ TUPLE(STRING, VARIANT) NULL │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
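As a rough mental model, JSON_EACH behaves like iterating over the top-level items of a parsed object. The sketch below uses Python's standard json module; the helper name json_each_sketch is hypothetical, and sorting the keys is an assumption made only to mirror the row order shown in the example output.

```python
import json


def json_each_sketch(json_text):
    """Return (key, value) pairs for the top level of a JSON object.

    Values are re-serialized as compact JSON text, so strings keep
    their double quotes, as in the example output.
    """
    value = json.loads(json_text)
    if not isinstance(value, dict):
        # Assumption: non-object input produces no rows.
        return []
    return [
        (key, json.dumps(item, separators=(",", ":")))
        for key, item in sorted(value.items())
    ]


pairs = json_each_sketch(
    '{"name": "John", "age": 25, "isStudent": false, "grades": [90, 85, 92]}'
)
for pair in pairs:
    print(pair)
```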
14.17 - JSON_EXTRACT_PATH_TEXT
Extracts a value from a JSON string by path_name.
The value is returned as a String, or NULL if either of the arguments is NULL.
This function is equivalent to to_varchar(GET_PATH(PARSE_JSON(JSON), PATH_NAME)).
Analyze Syntax
func.json_extract_path_text(<expr>, <path_name>)
Analyze Example
func.json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k2.k4')
+------------------------------------------------------------------------------+
| func.json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k2.k4') |
+------------------------------------------------------------------------------+
| 4 |
+------------------------------------------------------------------------------+
SQL Syntax
JSON_EXTRACT_PATH_TEXT( <expr>, <path_name> )
Arguments
Arguments | Description |
---|
<expr> | The JSON string value |
<path_name> | The String value that consists of a concatenation of field names |
Return Type
String
SQL Examples
SELECT json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k1[0]');
+-------------------------------------------------------------------------+
| json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k1[0]') |
+-------------------------------------------------------------------------+
| 0 |
+-------------------------------------------------------------------------+
SELECT json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k2:k3');
+-------------------------------------------------------------------------+
| json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k2:k3') |
+-------------------------------------------------------------------------+
| 3 |
+-------------------------------------------------------------------------+
SELECT json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k2.k4');
+-------------------------------------------------------------------------+
| json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k2.k4') |
+-------------------------------------------------------------------------+
| 4 |
+-------------------------------------------------------------------------+
SELECT json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k2.k5');
+-------------------------------------------------------------------------+
| json_extract_path_text('{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}', 'k2.k5') |
+-------------------------------------------------------------------------+
| NULL |
+-------------------------------------------------------------------------+
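The path forms used in the examples above (dot, colon, and bracket index) can be sketched with a small Python walker. This is an approximation for illustration only: the helper name json_extract_path_text_sketch is hypothetical, the tokenizer handles just the simple paths shown here, and returning string values without quotes is an assumption.

```python
import json
import re


def json_extract_path_text_sketch(json_text, path):
    """Walk a parsed JSON document along a simple path and return text."""
    if json_text is None or path is None:
        return None  # NULL in, NULL out
    current = json.loads(json_text)
    # Split 'k2.k4', 'k2:k3', or 'k1[0]' into field names and indexes.
    for token in re.findall(r"[^.:\[\]]+", path):
        if isinstance(current, dict) and token in current:
            current = current[token]
        elif (
            isinstance(current, list)
            and token.isdigit()
            and int(token) < len(current)
        ):
            current = current[int(token)]
        else:
            return None  # path does not exist
    if isinstance(current, str):
        return current  # assumption: strings are returned unquoted
    return json.dumps(current, separators=(",", ":"))


doc = '{"k1":[0,1,2], "k2":{"k3":3,"k4":4}}'
for p in ("k1[0]", "k2:k3", "k2.k4", "k2.k5"):
    print(p, "->", json_extract_path_text_sketch(doc, p))
```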
14.18 - JSON_OBJECT_KEYS
Returns an Array containing the list of keys in the input Variant OBJECT.
Analyze Syntax
func.json_object_keys(<variant>)
Analyze Example
func.json_object_keys(func.parse_json('{"a": 1, "b": [1,2,3]}')), func.json_object_keys(func.parse_json('{"b": [2,3,4]}'))
┌─────────────────────────────────────────────────────────────────┐
│ id │ json_object_keys(var) │ json_object_keys(var) │
├────────────────┼────────────────────────┼───────────────────────┤
│ 1 │ ["a","b"] │ ["a","b"] │
│ 2 │ ["b"] │ ["b"] │
└─────────────────────────────────────────────────────────────────┘
SQL Syntax
JSON_OBJECT_KEYS(<variant>)
Arguments
Arguments | Description |
---|
<variant> | The VARIANT value that contains an OBJECT |
Aliases
OBJECT_KEYS
Return Type
Array<String>
SQL Examples
CREATE TABLE IF NOT EXISTS objects_test1(id TINYINT, var VARIANT);
INSERT INTO
objects_test1
VALUES
(1, parse_json('{"a": 1, "b": [1,2,3]}'));
INSERT INTO
objects_test1
VALUES
(2, parse_json('{"b": [2,3,4]}'));
SELECT
id,
object_keys(var),
json_object_keys(var)
FROM
objects_test1;
┌────────────────────────────────────────────────────────────┐
│ id │ object_keys(var) │ json_object_keys(var) │
├────────────────┼───────────────────┼───────────────────────┤
│ 1 │ ["a","b"] │ ["a","b"] │
│ 2 │ ["b"] │ ["b"] │
└────────────────────────────────────────────────────────────┘
14.19 - JSON_PATH_EXISTS
Checks whether a specified path exists in JSON data.
Analyze Syntax
func.json_path_exists(<json_data>, <json_path_expression>)
Analyze Example
func.json_path_exists(func.parse_json('{"a": 1, "b": 2}'), '$.a ? (@ == 1)'), func.json_path_exists(func.parse_json('{"a": 1, "b": 2}'), '$.a ? (@ > 1)')
┌─────────────────────────────┐
│ Item 1 │ Item 2 │
├────────────────┼────────────┤
│ True │ False │
└─────────────────────────────┘
SQL Syntax
JSON_PATH_EXISTS(<json_data>, <json_path_expression>)
json_data: Specifies the JSON data you want to search within. It can be a JSON object or an array.
json_path_expression: Specifies the path, starting from the root of the JSON data represented by $, that you want to check within the JSON data. You can also include conditions within the expression, using @ to refer to the current node or element being evaluated, to filter the results.
Return Type
The function returns:
- true if the specified JSON path (and conditions, if any) exists within the JSON data.
- false if the specified JSON path (and conditions, if any) does not exist within the JSON data.
- NULL if either json_data or json_path_expression is NULL or invalid.
SQL Examples
SELECT JSON_PATH_EXISTS(parse_json('{"a": 1, "b": 2}'), '$.a ? (@ == 1)');
----
true
SELECT JSON_PATH_EXISTS(parse_json('{"a": 1, "b": 2}'), '$.a ? (@ > 1)');
----
false
SELECT JSON_PATH_EXISTS(NULL, '$.a');
----
NULL
SELECT JSON_PATH_EXISTS(parse_json('{"a": 1, "b": 2}'), NULL);
----
NULL
14.20 - JSON_PATH_MATCH
Checks whether a specified JSON path expression matches certain conditions within JSON data. Note that the @@ operator is synonymous with this function. For more information, see JSON Operators.
Analyze Syntax
func.json_path_match(<json_data>, <json_path_expression>)
Analyze Example
func.json_path_match(func.parse_json('{"a":1,"b":[1,2,3]}'), '$.a == 1')
┌──────────────────────────────────────────────────────────────────────────┐
│ func.json_path_match(func.parse_json('{"a":1,"b":[1,2,3]}'), '$.a == 1') │
├──────────────────────────────────────────────────────────────────────────┤
│ true │
└──────────────────────────────────────────────────────────────────────────┘
func.json_path_match(func.parse_json('{"a":1,"b":[1,2,3]}'), '$.b[0] > 1')
┌────────────────────────────────────────────────────────────────────────────┐
│ func.json_path_match(func.parse_json('{"a":1,"b":[1,2,3]}'), '$.b[0] > 1') │
├────────────────────────────────────────────────────────────────────────────┤
│ false │
└────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
JSON_PATH_MATCH(<json_data>, <json_path_expression>)
json_data: Specifies the JSON data you want to examine. It can be a JSON object or an array.
json_path_expression: Specifies the conditions to be checked within the JSON data. This expression describes the specific path or criteria to be matched, such as verifying whether specific field values in the JSON structure meet certain conditions. The $ symbol represents the root of the JSON data. It is used to start the path expression and indicates the top-level object in the JSON structure.
Return Type
The function returns:
- true if the specified JSON path expression matches the conditions within the JSON data.
- false if the specified JSON path expression does not match the conditions within the JSON data.
- NULL if either json_data or json_path_expression is NULL or invalid.
SQL Examples
-- Check if the value at JSON path $.a is equal to 1
SELECT JSON_PATH_MATCH(parse_json('{"a":1,"b":[1,2,3]}'), '$.a == 1');
┌────────────────────────────────────────────────────────────────┐
│ json_path_match(parse_json('{"a":1,"b":[1,2,3]}'), '$.a == 1') │
├────────────────────────────────────────────────────────────────┤
│ true │
└────────────────────────────────────────────────────────────────┘
-- Check if the first element in the array at JSON path $.b is greater than 1
SELECT JSON_PATH_MATCH(parse_json('{"a":1,"b":[1,2,3]}'), '$.b[0] > 1');
┌──────────────────────────────────────────────────────────────────┐
│ json_path_match(parse_json('{"a":1,"b":[1,2,3]}'), '$.b[0] > 1') │
├──────────────────────────────────────────────────────────────────┤
│ false │
└──────────────────────────────────────────────────────────────────┘
-- Check if any element in the array at JSON path $.b,
-- from the second one to the last, is greater than or equal to 2
SELECT JSON_PATH_MATCH(parse_json('{"a":1,"b":[1,2,3]}'), '$.b[1 to last] >= 2');
┌───────────────────────────────────────────────────────────────────────────┐
│ json_path_match(parse_json('{"a":1,"b":[1,2,3]}'), '$.b[1 to last] >= 2') │
├───────────────────────────────────────────────────────────────────────────┤
│ true │
└───────────────────────────────────────────────────────────────────────────┘
-- NULL is returned if either the json_data or json_path_expression is NULL or invalid.
SELECT JSON_PATH_MATCH(parse_json('{"a":1,"b":[1,2,3]}'), NULL);
┌──────────────────────────────────────────────────────────┐
│ json_path_match(parse_json('{"a":1,"b":[1,2,3]}'), null) │
├──────────────────────────────────────────────────────────┤
│ NULL │
└──────────────────────────────────────────────────────────┘
SELECT JSON_PATH_MATCH(NULL, '$.a == 1');
┌───────────────────────────────────┐
│ json_path_match(null, '$.a == 1') │
├───────────────────────────────────┤
│ NULL │
└───────────────────────────────────┘
14.21 - JSON_PATH_QUERY
Get all JSON items returned by JSON path for the specified JSON value.
Analyze Syntax
func.json_path_query(<variant>, <path_name>)
Analyze Example
table.name, func.json_path_query(table.details, '$.features.*').alias('all_features')
+------------+--------------+
| name | all_features |
+------------+--------------+
| Laptop | "16GB" |
| Laptop | "512GB" |
| Smartphone | "4GB" |
| Smartphone | "128GB" |
| Headphones | "20h" |
| Headphones | "5.0" |
+------------+--------------+
SQL Syntax
JSON_PATH_QUERY(<variant>, '<path_name>')
Return Type
VARIANT
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE products (
name VARCHAR,
details VARIANT
);
INSERT INTO products (name, details)
VALUES ('Laptop', '{"brand": "Dell", "colors": ["Black", "Silver"], "price": 1200, "features": {"ram": "16GB", "storage": "512GB"}}'),
('Smartphone', '{"brand": "Apple", "colors": ["White", "Black"], "price": 999, "features": {"ram": "4GB", "storage": "128GB"}}'),
('Headphones', '{"brand": "Sony", "colors": ["Black", "Blue", "Red"], "price": 150, "features": {"battery": "20h", "bluetooth": "5.0"}}');
Query Demo: Extracting All Features from Product Details
SELECT
name,
JSON_PATH_QUERY(details, '$.features.*') AS all_features
FROM
products;
Result
+------------+--------------+
| name | all_features |
+------------+--------------+
| Laptop | "16GB" |
| Laptop | "512GB" |
| Smartphone | "4GB" |
| Smartphone | "128GB" |
| Headphones | "20h" |
| Headphones | "5.0" |
+------------+--------------+
14.22 - JSON_PATH_QUERY_ARRAY
Gets all JSON items returned by the JSON path for the specified JSON value and wraps the result in an array.
Analyze Syntax
func.json_path_query_array(<variant>, <path_name>)
Analyze Example
table.name, func.json_path_query_array(table.details, '$.features.*').alias('all_features')
name | all_features
------------+-----------------------
Laptop | ["16GB", "512GB"]
Smartphone | ["4GB", "128GB"]
Headphones | ["20h", "5.0"]
SQL Syntax
JSON_PATH_QUERY_ARRAY(<variant>, '<path_name>')
Return Type
VARIANT
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE products (
name VARCHAR,
details VARIANT
);
INSERT INTO products (name, details)
VALUES ('Laptop', '{"brand": "Dell", "colors": ["Black", "Silver"], "price": 1200, "features": {"ram": "16GB", "storage": "512GB"}}'),
('Smartphone', '{"brand": "Apple", "colors": ["White", "Black"], "price": 999, "features": {"ram": "4GB", "storage": "128GB"}}'),
('Headphones', '{"brand": "Sony", "colors": ["Black", "Blue", "Red"], "price": 150, "features": {"battery": "20h", "bluetooth": "5.0"}}');
Query Demo: Extracting All Features from Product Details as an Array
SELECT
name,
JSON_PATH_QUERY_ARRAY(details, '$.features.*') AS all_features
FROM
products;
Result
name | all_features
-----------+-----------------------
Laptop | ["16GB", "512GB"]
Smartphone | ["4GB", "128GB"]
Headphones | ["20h", "5.0"]
14.23 - JSON_PATH_QUERY_FIRST
Get the first JSON item returned by JSON path for the specified JSON value.
Analyze Syntax
func.json_path_query_first(<variant>, <path_name>)
Analyze Example
table.name, func.json_path_query_first(table.details, '$.features.*').alias('first_feature')
+------------+---------------+
| name | first_feature |
+------------+---------------+
| Laptop | "16GB" |
| Smartphone | "4GB" |
| Headphones | "20h" |
+------------+---------------+
SQL Syntax
JSON_PATH_QUERY_FIRST(<variant>, '<path_name>')
Return Type
VARIANT
SQL Examples
Create a Table and Insert Sample Data
CREATE TABLE products (
name VARCHAR,
details VARIANT
);
INSERT INTO products (name, details)
VALUES ('Laptop', '{"brand": "Dell", "colors": ["Black", "Silver"], "price": 1200, "features": {"ram": "16GB", "storage": "512GB"}}'),
('Smartphone', '{"brand": "Apple", "colors": ["White", "Black"], "price": 999, "features": {"ram": "4GB", "storage": "128GB"}}'),
('Headphones', '{"brand": "Sony", "colors": ["Black", "Blue", "Red"], "price": 150, "features": {"battery": "20h", "bluetooth": "5.0"}}');
Query Demo: Extracting the First Feature from Product Details
SELECT
name,
JSON_PATH_QUERY(details, '$.features.*') AS all_features,
JSON_PATH_QUERY_FIRST(details, '$.features.*') AS first_feature
FROM
products;
Result
+------------+--------------+---------------+
| name | all_features | first_feature |
+------------+--------------+---------------+
| Laptop | "16GB" | "16GB" |
| Laptop | "512GB" | "16GB" |
| Smartphone | "4GB" | "4GB" |
| Smartphone | "128GB" | "4GB" |
| Headphones | "20h" | "20h" |
| Headphones | "5.0" | "20h" |
+------------+--------------+---------------+
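The relationship between JSON_PATH_QUERY, JSON_PATH_QUERY_ARRAY, and JSON_PATH_QUERY_FIRST can be sketched in Python for the single path used in these examples. The helper features_wildcard is hypothetical and evaluates only $.features.* by hand; it is not a general JSON path engine.

```python
import json


def features_wildcard(details_text):
    """Evaluate the path $.features.* by hand: every value under 'features'."""
    details = json.loads(details_text)
    return list(details.get("features", {}).values())


laptop = '{"brand": "Dell", "price": 1200, "features": {"ram": "16GB", "storage": "512GB"}}'
items = features_wildcard(laptop)

query_rows = items                        # JSON_PATH_QUERY: one row per item
query_array = json.dumps(items)           # JSON_PATH_QUERY_ARRAY: one array value
query_first = items[0] if items else None  # JSON_PATH_QUERY_FIRST: first item only

print(query_rows)
print(query_array)
print(query_first)
```

Because JSON_PATH_QUERY yields one row per matched item, selecting it next to JSON_PATH_QUERY_FIRST repeats the first item on every row, which is what the result table above shows.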
14.24 - JSON_PRETTY
Formats JSON data, making it more readable and presentable. It automatically adds indentation, line breaks, and other formatting to the JSON data for better visual representation.
Analyze Syntax
func.json_pretty(<json_string>)
Analyze Example
func.json_pretty(func.parse_json('{"person": {"name": "bob", "age": 25}, "location": "city"}'))
┌─────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.json_pretty(func.parse_json('{"person": {"name": "bob", "age": 25}, "location": "city"}')) │
│ String │
├─────────────────────────────────────────────────────────────────────────────────────────────────┤
│ { │
│ "location": "City", │
│ "person": { │
│ "age": 25, │
│ "name": "Bob" │
│ } │
│ } │
└─────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
JSON_PRETTY(<json_string>)
Return Type
String.
SQL Examples
SELECT JSON_PRETTY(PARSE_JSON('{"name":"Alice","age":30}'));
---
┌──────────────────────────────────────────────────────┐
│ json_pretty(parse_json('{"name":"alice","age":30}')) │
│ String │
├──────────────────────────────────────────────────────┤
│ { │
│ "age": 30, │
│ "name": "Alice" │
│ } │
└──────────────────────────────────────────────────────┘
SELECT JSON_PRETTY(PARSE_JSON('{"person": {"name": "Bob", "age": 25}, "location": "City"}'));
---
┌───────────────────────────────────────────────────────────────────────────────────────┐
│ json_pretty(parse_json('{"person": {"name": "bob", "age": 25}, "location": "city"}')) │
│ String │
├───────────────────────────────────────────────────────────────────────────────────────┤
│ { │
│ "location": "City", │
│ "person": { │
│ "age": 25, │
│ "name": "Bob" │
│ } │
│ } │
└───────────────────────────────────────────────────────────────────────────────────────┘
14.25 - JSON_STRIP_NULLS
Removes all properties with null values from a JSON object.
Analyze Syntax
func.json_strip_nulls(<json_string>)
Analyze Example
func.json_strip_nulls(func.parse_json('{"name": "alice", "age": 30, "city": null}'))
┌─────────────────────────────────────────────────────────────────────────────────────────────────┐
│ func.json_strip_nulls(func.parse_json('{"name": "alice", "age": 30, "city": null}')) │
│ String │
├─────────────────────────────────────────────────────────────────────────────────────────────────┤
│ {"age":30,"name":"Alice"} │
└─────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
JSON_STRIP_NULLS(<json_string>)
Return Type
Returns a value of the same type as the input JSON value.
SQL Examples
SELECT JSON_STRIP_NULLS(PARSE_JSON('{"name": "Alice", "age": 30, "city": null}'));
json_strip_nulls(parse_json('{"name": "alice", "age": 30, "city": null}'))|
--------------------------------------------------------------------------+
{"age":30,"name":"Alice"} |
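A minimal Python sketch of the idea follows. The helper name strip_nulls_sketch is hypothetical, and two behaviors are assumptions rather than documented facts: that null-valued properties are also removed from nested objects, and that null elements inside arrays are kept.

```python
import json


def strip_nulls_sketch(value):
    """Drop object properties whose value is null (None in Python)."""
    if isinstance(value, dict):
        # Assumption: nested objects are cleaned as well.
        return {
            key: strip_nulls_sketch(item)
            for key, item in value.items()
            if item is not None
        }
    if isinstance(value, list):
        # Assumption: null array elements are kept; only properties are removed.
        return [strip_nulls_sketch(item) for item in value]
    return value


parsed = json.loads('{"name": "Alice", "age": 30, "city": null}')
print(json.dumps(strip_nulls_sketch(parsed), separators=(",", ":"), sort_keys=True))
```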
14.26 - JSON_TO_STRING
Alias for TO_STRING.
14.27 - JSON_TYPEOF
Returns the type of the top-level value of a JSON structure.
Analyze Syntax
func.json_typeof(<json_string>)
Analyze Example
func.json_typeof(func.parse_json('null'))|
-----------------------------------------+
null |
--
func.json_typeof(func.parse_json('true'))|
-----------------------------------------+
boolean |
--
func.json_typeof(func.parse_json('"plaidcloud"'))|
-----------------------------------------------+
string |
--
func.json_typeof(func.parse_json('-1.23'))|
------------------------------------------+
number |
--
func.json_typeof(func.parse_json('[1,2,3]'))|
--------------------------------------------+
array |
--
func.json_typeof(func.parse_json('{"name": "alice", "age": 30}'))|
-----------------------------------------------------------------+
object |
SQL Syntax
JSON_TYPEOF(<json_string>)
Return Type
JSON_TYPEOF returns a string that indicates the data type of the parsed JSON value. The possible return values are: 'null', 'boolean', 'string', 'number', 'array', and 'object'.
SQL Examples
-- Parsing a JSON value that is NULL
SELECT JSON_TYPEOF(PARSE_JSON(NULL));
--
json_typeof(parse_json(null))|
-----------------------------+
|
-- Parsing a JSON value that is the string 'null'
SELECT JSON_TYPEOF(PARSE_JSON('null'));
--
json_typeof(parse_json('null'))|
-------------------------------+
null |
SELECT JSON_TYPEOF(PARSE_JSON('true'));
--
json_typeof(parse_json('true'))|
-------------------------------+
boolean |
SELECT JSON_TYPEOF(PARSE_JSON('"PlaidCloud Lakehouse"'));
--
json_typeof(parse_json('"plaidcloud lakehouse"'))|
-------------------------------------------------+
string |
SELECT JSON_TYPEOF(PARSE_JSON('-1.23'));
--
json_typeof(parse_json('-1.23'))|
--------------------------------+
number |
SELECT JSON_TYPEOF(PARSE_JSON('[1,2,3]'));
--
json_typeof(parse_json('[1,2,3]'))|
----------------------------------+
array |
SELECT JSON_TYPEOF(PARSE_JSON('{"name": "Alice", "age": 30}'));
--
json_typeof(parse_json('{"name": "alice", "age": 30}'))|
-------------------------------------------------------+
object |
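The six return values map naturally onto the types produced by a JSON parser. The sketch below shows that mapping with Python's standard json module; the helper name json_typeof_sketch is hypothetical, and this is an illustration of the classification rather than the Lakehouse implementation.

```python
import json


def json_typeof_sketch(json_text):
    """Classify the top-level value of a JSON document."""
    if json_text is None:
        return None  # SQL NULL input gives a NULL result, not the string 'null'
    value = json.loads(json_text)
    if value is None:
        return "null"
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


for text in ("null", "true", '"PlaidCloud Lakehouse"', "-1.23", "[1,2,3]", '{"name": "Alice", "age": 30}'):
    print(text, "->", json_typeof_sketch(text))
```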
14.28 - OBJECT_KEYS
Alias for JSON_OBJECT_KEYS.
14.29 - PARSE_JSON
Interprets an input JSON string, producing a VARIANT value.
parse_json and try_parse_json interpret an input string as a JSON document, producing a VARIANT value. try_parse_json returns a NULL value if an error occurs during parsing.
Analyze Syntax
func.parse_json(<json_string>)
or
func.try_parse_json(<json_string>)
Analyze Example
func.parse_json('[-1, 12, 289, 2188, false]')
+-----------------------------------------------+
| func.parse_json('[-1, 12, 289, 2188, false]') |
+-----------------------------------------------+
| [-1,12,289,2188,false] |
+-----------------------------------------------+
func.try_parse_json('{ "x" : "abc", "y" : false, "z": 10} ')
+--------------------------------------------------------------+
| func.try_parse_json('{ "x" : "abc", "y" : false, "z": 10} ') |
+--------------------------------------------------------------+
| {"x":"abc","y":false,"z":10} |
+--------------------------------------------------------------+
SQL Syntax
PARSE_JSON(<expr>)
TRY_PARSE_JSON(<expr>)
Arguments
Arguments | Description |
---|
<expr> | An expression of string type (e.g. VARCHAR) that holds valid JSON information. |
Return Type
VARIANT
SQL Examples
SELECT parse_json('[-1, 12, 289, 2188, false]');
+------------------------------------------+
| parse_json('[-1, 12, 289, 2188, false]') |
+------------------------------------------+
| [-1,12,289,2188,false] |
+------------------------------------------+
SELECT try_parse_json('{ "x" : "abc", "y" : false, "z": 10} ');
+---------------------------------------------------------+
| try_parse_json('{ "x" : "abc", "y" : false, "z": 10} ') |
+---------------------------------------------------------+
| {"x":"abc","y":false,"z":10} |
+---------------------------------------------------------+
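The difference between the two functions is only in how a parsing failure is reported. A minimal Python analogy, assuming the standard json module stands in for the JSON parser and with hypothetical helper names:

```python
import json


def parse_json_sketch(text):
    """Parse JSON text; invalid input raises an error (json.JSONDecodeError)."""
    return json.loads(text)


def try_parse_json_sketch(text):
    """Parse JSON text; invalid input returns None instead of raising."""
    try:
        return json.loads(text)
    except (TypeError, ValueError):
        return None


print(parse_json_sketch('[-1, 12, 289, 2188, false]'))
print(try_parse_json_sketch('{ "x" : "abc", "y" : false, "z": 10} '))
print(try_parse_json_sketch('{not valid json'))
```

Use the try_ variant when malformed rows should become NULL rather than fail the whole query.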
15 - SLEEP
Sleeps for the given number of seconds on each data block.
Warning: Only use this function for testing where a sleep is required.
SQL Syntax
Arguments
Arguments | Description |
---|
seconds | Must be a constant column of any nonnegative number or float. |
Return Type
UInt8
SQL Examples
SELECT sleep(2);
+----------+
| sleep(2) |
+----------+
| 0 |
+----------+
16 - String Functions
This section provides reference information for the string-related functions in PlaidCloud Lakehouse.
String Manipulation:
Case Conversion:
Regular Expressions:
Encoding and Decoding:
Miscellaneous:
16.1 - ASCII
Returns the numeric value of the leftmost character of the string str.
Analyze Syntax
Analyze Examples
func.ascii('2')
+-----------------+
| func.ascii('2') |
+-----------------+
| 50 |
+-----------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | The string. |
Return Type
TINYINT
SQL Examples
SELECT ASCII('2');
+------------+
| ASCII('2') |
+------------+
| 50 |
+------------+
16.2 - BIN
Returns a string representation of the binary value of N.
Analyze Syntax
Analyze Examples
func.bin(12)
+--------------+
| func.bin(12) |
+--------------+
| 1100 |
+--------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | The number. |
Return Type
VARCHAR
SQL Examples
SELECT BIN(12);
+---------+
| BIN(12) |
+---------+
| 1100 |
+---------+
16.3 - BIT_LENGTH
Returns the length of a string in bits.
Analyze Syntax
Analyze Examples
func.bit_length('Word')
+-------------------------+
| func.bit_length('Word') |
+-------------------------+
| 32 |
+-------------------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | The string. |
Return Type
BIGINT
SQL Examples
SELECT BIT_LENGTH('Word');
+--------------------+
| BIT_LENGTH('Word') |
+--------------------+
| 32 |
+--------------------+
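The result of 32 follows from counting bytes and multiplying by eight. A short Python sketch, assuming the string is measured in its UTF-8 encoding (the helper name is hypothetical):

```python
def bit_length_sketch(text):
    """Length of a string in bits, assuming UTF-8 and 8 bits per byte."""
    return len(text.encode("utf-8")) * 8


print(bit_length_sketch("Word"))  # 4 bytes * 8 bits = 32
```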
16.4 - CHAR
Returns the character for each integer passed.
Analyze Syntax
Analyze Examples
func.char(77,121,83,81,76)
+-----------------------------+
| func.char(77,121,83,81,76) |
+-----------------------------+
| 4D7953514C |
+-----------------------------+
SQL Syntax
Arguments
Arguments | Description |
---|
N | Numeric Column |
Return Type
BINARY
SQL Examples
This example shows both the binary value returned as well as the string representation.
SELECT CHAR(77,121,83,81,76) as a, a::String;
┌────────────────────────┐
│ a │ a::string │
│ Binary │ String │
├────────────┼───────────┤
│ 4D7953514C │ MySQL │
└────────────────────────┘
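The two columns in this example are two views of the same bytes. In Python terms, under the assumption that each integer is a single byte value (the helper name char_sketch is hypothetical):

```python
def char_sketch(*codes):
    """Build a binary value from integer byte codes (each must be 0-255)."""
    return bytes(codes)


raw = char_sketch(77, 121, 83, 81, 76)
print(raw.hex().upper())    # hexadecimal view of the binary value
print(raw.decode("utf-8"))  # the same bytes read as a string
```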
16.5 - CHAR_LENGTH
Alias for LENGTH.
16.6 - CHARACTER_LENGTH
Alias for LENGTH.
16.7 - CONCAT
Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are nonbinary strings, the result is a nonbinary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent nonbinary string form.
Analyze Syntax
func.concat(<expr1>, ...)
Analyze Examples
func.concat('data', 'bend')
+-----------------------------+
| func.concat('data', 'bend') |
+-----------------------------+
| databend |
+-----------------------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<expr1> | string |
Return Type
A VARCHAR data type value or NULL.
SQL Examples
SELECT CONCAT('data', 'bend');
+------------------------+
| concat('data', 'bend') |
+------------------------+
| databend |
+------------------------+
SELECT CONCAT('data', NULL, 'bend');
+------------------------------+
| CONCAT('data', NULL, 'bend') |
+------------------------------+
| NULL |
+------------------------------+
SELECT CONCAT('14.3');
+----------------+
| concat('14.3') |
+----------------+
| 14.3 |
+----------------+
16.8 - CONCAT_WS
CONCAT_WS() stands for Concatenate With Separator and is a special form of CONCAT(). The first argument is the separator for the rest of the arguments. The separator is added between the strings to be concatenated. The separator can be a string, as can the rest of the arguments. If the separator is NULL, the result is NULL.
CONCAT_WS() does not skip empty strings. However, it does skip any NULL values after the separator argument.
Analyze Syntax
func.concat_ws(<separator>, <expr1>, ...)
Analyze Examples
func.concat_ws(',', 'data', 'fuse', 'labs', '2021')
+-----------------------------------------------------+
| func.concat_ws(',', 'data', 'fuse', 'labs', '2021') |
+-----------------------------------------------------+
| data,fuse,labs,2021 |
+-----------------------------------------------------+
SQL Syntax
CONCAT_WS(<separator>, <expr1>, ...)
Arguments
Arguments | Description |
---|
<separator> | string column |
<expr1> | value column |
Return Type
A VARCHAR data type value or a NULL data type.
SQL Examples
SELECT CONCAT_WS(',', 'data', 'fuse', 'labs', '2021');
+------------------------------------------------+
| CONCAT_WS(',', 'data', 'fuse', 'labs', '2021') |
+------------------------------------------------+
| data,fuse,labs,2021 |
+------------------------------------------------+
SELECT CONCAT_WS(',', 'data', NULL, 'bend');
+--------------------------------------+
| CONCAT_WS(',', 'data', NULL, 'bend') |
+--------------------------------------+
| data,bend |
+--------------------------------------+
SELECT CONCAT_WS(',', 'data', NULL, NULL, 'bend');
+--------------------------------------------+
| CONCAT_WS(',', 'data', NULL, NULL, 'bend') |
+--------------------------------------------+
| data,bend |
+--------------------------------------------+
SELECT CONCAT_WS(NULL, 'data', 'fuse', 'labs');
+-----------------------------------------+
| CONCAT_WS(NULL, 'data', 'fuse', 'labs') |
+-----------------------------------------+
| NULL |
+-----------------------------------------+
SELECT CONCAT_WS(',', NULL);
+----------------------+
| CONCAT_WS(',', NULL) |
+----------------------+
| |
+----------------------+
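The NULL rules shown in the examples above can be summarized in a few lines of Python, with `None` standing in for NULL. This is a sketch of the documented behavior, not the engine's implementation; the function name `concat_ws` is reused here only for readability.

```python
from typing import Optional


def concat_ws(separator: Optional[str], *values: Optional[str]) -> Optional[str]:
    """Join values with separator, following the documented NULL rules."""
    if separator is None:
        return None  # a NULL separator makes the whole result NULL
    # NULL values are skipped; empty strings are kept.
    return separator.join(v for v in values if v is not None)


print(concat_ws(",", "data", None, "bend"))
print(concat_ws(",", "a", "", "b"))
```

Note that an empty string still contributes a separator, while a NULL value does not.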
16.9 - FROM_BASE64
Takes a string encoded with the base-64 encoding rules and returns the decoded result as a binary.
The result is NULL if the argument is NULL or not a valid base-64 string.
Analyze Syntax
Analyze Examples
func.from_base64('YWJj')
+--------------------------+
| func.from_base64('YWJj') |
+--------------------------+
| abc |
+--------------------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<expr> | The string value. |
Return Type
BINARY
SQL Examples
SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc')) as b, b::String;
┌───────────────────────────────────────┐
│ to_base64('abc') │ b │ b::string │
│ String │ Binary │ String │
├──────────────────┼────────┼───────────┤
│ YWJj │ 616263 │ abc │
└───────────────────────────────────────┘
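For comparison, the same round trip can be sketched with Python's standard `base64` module. Returning `None` for invalid input mirrors the NULL rule described above; how strictly the engine validates its input is an assumption here.

```python
import base64
import binascii
from typing import Optional


def from_base64(text: Optional[str]) -> Optional[bytes]:
    """Decode base-64 text; None models NULL for missing or invalid input."""
    if text is None:
        return None
    try:
        return base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return None


decoded = from_base64("YWJj")
print(decoded.hex())             # binary value as hex
print(decoded.decode("utf-8"))   # string representation
```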
16.10 - FROM_HEX
Alias for UNHEX.
16.11 - HEX
Alias for TO_HEX.
16.12 - INSERT
Returns the string str, with the substring beginning at position pos and len characters long replaced by the string newstr. Returns the original string if pos is not within the length of the string. Replaces the rest of the string from position pos if len is not within the length of the rest of the string. Returns NULL if any argument is NULL.
Analyze Syntax
func.insert(<str>, <pos>, <len>, <newstr>)
Analyze Examples
func.insert('Quadratic', 3, 4, 'What')
+----------------------------------------+
| func.insert('Quadratic', 3, 4, 'What') |
+----------------------------------------+
| QuWhattic |
+----------------------------------------+
SQL Syntax
INSERT(<str>, <pos>, <len>, <newstr>)
Arguments
Arguments | Description |
---|
<str> | The string. |
<pos> | The position. |
<len> | The length. |
<newstr> | The new string. |
Return Type
VARCHAR
SQL Examples
SELECT INSERT('Quadratic', 3, 4, 'What');
+-----------------------------------+
| INSERT('Quadratic', 3, 4, 'What') |
+-----------------------------------+
| QuWhattic |
+-----------------------------------+
SELECT INSERT('Quadratic', -1, 4, 'What');
+---------------------------------------+
| INSERT('Quadratic', (- 1), 4, 'What') |
+---------------------------------------+
| Quadratic |
+---------------------------------------+
SELECT INSERT('Quadratic', 3, 100, 'What');
+-------------------------------------+
| INSERT('Quadratic', 3, 100, 'What') |
+-------------------------------------+
| QuWhat |
+-------------------------------------+
+--------------------------------------------+--------+
| INSERT('123456789', number, number, 'aaa') | number |
+--------------------------------------------+--------+
| 123456789 | 0 |
| aaa23456789 | 1 |
| 1aaa456789 | 2 |
| 12aaa6789 | 3 |
| 123aaa89 | 4 |
| 1234aaa | 5 |
| 12345aaa | 6 |
| 123456aaa | 7 |
| 1234567aaa | 8 |
| 12345678aaa | 9 |
| 123456789 | 10 |
| 123456789 | 11 |
| 123456789 | 12 |
+--------------------------------------------+--------+
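The boundary rules described above (position outside the string, length running past the end) can be modeled in Python as follows. This is a sketch under stated assumptions: positions are 1-based, `None` stands in for NULL, and a negative length is treated like an over-long one, which the text above does not confirm.

```python
from typing import Optional


def sql_insert(s: Optional[str], pos: Optional[int], length: Optional[int],
               newstr: Optional[str]) -> Optional[str]:
    """Model of INSERT(str, pos, len, newstr) with 1-based positions."""
    if s is None or pos is None or length is None or newstr is None:
        return None
    if pos < 1 or pos > len(s):
        return s  # pos is outside the string: return it unchanged
    start = pos - 1
    # Assumption: a negative len behaves like a len that runs past the end.
    if length < 0 or start + length > len(s):
        return s[:start] + newstr  # replace everything from pos onward
    return s[:start] + newstr + s[start + length:]


# Reproduces the table above, where pos and len both equal `number`.
for n in range(13):
    print(n, sql_insert("123456789", n, n, "aaa"))
```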
16.13 - INSTR
Returns the position of the first occurrence of substring substr in string str.
This is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.
Analyze Syntax
func.instr(<str>, <substr>)
Analyze Examples
func.instr('foobarbar', 'bar')
+--------------------------------+
| func.instr('foobarbar', 'bar') |
+--------------------------------+
| 4 |
+--------------------------------+
SQL Syntax
INSTR(<str>, <substr>)
Arguments
Arguments | Description |
---|
<str> | The string. |
<substr> | The substring. |
Return Type
BIGINT
SQL Examples
SELECT INSTR('foobarbar', 'bar');
+---------------------------+
| INSTR('foobarbar', 'bar') |
+---------------------------+
| 4 |
+---------------------------+
SELECT INSTR('xbar', 'foobar');
+-------------------------+
| INSTR('xbar', 'foobar') |
+-------------------------+
| 0 |
+-------------------------+
16.14 - LCASE
Alias for LOWER.
16.15 - LEFT
Returns the leftmost len characters from the string str, or NULL if any argument is NULL.
Analyze Syntax
Analyze Examples
func.left('foobarbar', 5)
+---------------------------+
| func.left('foobarbar', 5) |
+---------------------------+
| fooba |
+---------------------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<str> | The main string from where the character to be extracted |
<len> | The count of characters |
Return Type
VARCHAR
SQL Examples
SELECT LEFT('foobarbar', 5);
+----------------------+
| LEFT('foobarbar', 5) |
+----------------------+
| fooba |
+----------------------+
16.16 - LENGTH
Returns the length of a given input string or binary value. In the case of strings, the length represents the count of characters, with each UTF-8 character considered as a single character. For binary data, the length corresponds to the number of bytes.
Analyze Syntax
Analyze Examples
func.length('Hello')
+----------------------+
| func.length('Hello') |
+----------------------+
| 5 |
+----------------------+
SQL Syntax
Aliases
Return Type
BIGINT
SQL Examples
SELECT LENGTH('Hello'), LENGTH_UTF8('Hello'), CHAR_LENGTH('Hello'), CHARACTER_LENGTH('Hello');
┌───────────────────────────────────────────────────────────────────────────────────────────┐
│ length('hello') │ length_utf8('hello') │ char_length('hello') │ character_length('hello') │
├─────────────────┼──────────────────────┼──────────────────────┼───────────────────────────┤
│ 5 │ 5 │ 5 │ 5 │
└───────────────────────────────────────────────────────────────────────────────────────────┘
16.17 - LENGTH_UTF8
Alias for LENGTH.
16.18 - LIKE
Pattern matching using an SQL pattern. Returns 1 (TRUE) or 0 (FALSE). If either expr or pat is NULL, the result is NULL.
Analyze Syntax
<column>.like(<pattern>)
Analyze Examples
my_clothes.like('plaid%')
+-----------------+
| my_clothes |
+-----------------+
| plaid pants |
| plaid hat |
| plaid shirt |
+-----------------+
SQL Syntax
<expr> LIKE <pattern>
SQL Examples
SELECT name, category FROM system.functions WHERE name like 'tou%' ORDER BY name;
+----------+------------+
| name | category |
+----------+------------+
| touint16 | conversion |
| touint32 | conversion |
| touint64 | conversion |
| touint8 | conversion |
+----------+------------+
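To make the wildcard semantics concrete, the sketch below translates a LIKE pattern into a regular expression in Python: `%` matches any run of characters and `_` matches exactly one. It assumes case-sensitive matching and does not model an escape character; the helper names are illustrative.

```python
import re
from typing import Optional


def like_to_regex(pattern: str) -> str:
    """Translate SQL LIKE wildcards into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any sequence of characters, including none
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def sql_like(value: Optional[str], pattern: Optional[str]) -> Optional[bool]:
    """None models NULL: a NULL operand yields a NULL result."""
    if value is None or pattern is None:
        return None
    # LIKE must match the whole value, hence fullmatch rather than search.
    return re.fullmatch(like_to_regex(pattern), value, flags=re.DOTALL) is not None


names = ["touint16", "touint32", "touint64", "touint8", "tostring"]
print([n for n in names if sql_like(n, "tou%")])
```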
16.19 - LOCATE
The first syntax returns the position of the first occurrence of substring substr in string str.
The second syntax returns the position of the first occurrence of substring substr in string str, starting at position pos.
Returns 0 if substr is not in str. Returns NULL if any argument is NULL.
Analyze Syntax
func.locate(<substr>, <str>, <pos>)
Analyze Examples
func.locate('bar', 'foobarbar')
+------------------------------------+
| func.locate('bar', 'foobarbar') |
+------------------------------------+
| 4 |
+------------------------------------+
func.locate('bar', 'foobarbar', 5)
+------------------------------------+
| func.locate('bar', 'foobarbar', 5) |
+------------------------------------+
| 7 |
+------------------------------------+
SQL Syntax
LOCATE(<substr>, <str>)
LOCATE(<substr>, <str>, <pos>)
Arguments
Arguments | Description |
---|
<substr> | The substring. |
<str> | The string. |
<pos> | The position. |
Return Type
BIGINT
SQL Examples
SELECT LOCATE('bar', 'foobarbar')
+----------------------------+
| LOCATE('bar', 'foobarbar') |
+----------------------------+
| 4 |
+----------------------------+
SELECT LOCATE('xbar', 'foobar')
+--------------------------+
| LOCATE('xbar', 'foobar') |
+--------------------------+
| 0 |
+--------------------------+
SELECT LOCATE('bar', 'foobarbar', 5)
+-------------------------------+
| LOCATE('bar', 'foobarbar', 5) |
+-------------------------------+
| 7 |
+-------------------------------+
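The 1-based position arithmetic, and the reversed argument order of INSTR, can be sketched in Python on top of `str.find`. Returning 0 for a start position below 1 is an assumption, not something stated above; `None` stands in for NULL.

```python
from typing import Optional


def locate(substr: Optional[str], s: Optional[str],
           pos: Optional[int] = 1) -> Optional[int]:
    """1-based position of substr in s, searching from pos; 0 if absent."""
    if substr is None or s is None or pos is None:
        return None
    if pos < 1:
        return 0  # assumption: positions below 1 never match
    # str.find is 0-based and returns -1 when absent, so adding 1 yields 0.
    return s.find(substr, pos - 1) + 1


def instr(s: Optional[str], substr: Optional[str]) -> Optional[int]:
    """Same search as the two-argument locate, with the arguments swapped."""
    return locate(substr, s)


print(locate("bar", "foobarbar"))
print(locate("bar", "foobarbar", 5))
print(instr("foobarbar", "bar"))
```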
16.20 - LOWER
Returns a string with all characters changed to lowercase.
Analyze Syntax
Analyze Examples
func.lower('Hello, PlaidCloud!')
+----------------------------------+
| func.lower('Hello, PlaidCloud!') |
+----------------------------------+
| hello, plaidcloud! |
+----------------------------------+
SQL Syntax
Aliases
Return Type
VARCHAR
SQL Examples
SELECT LOWER('Hello, Databend!'), LCASE('Hello, Databend!');
┌───────────────────────────────────────────────────────┐
│ lower('hello, databend!') │ lcase('hello, databend!') │
├───────────────────────────┼───────────────────────────┤
│ hello, databend! │ hello, databend! │
└───────────────────────────────────────────────────────┘
16.21 - LPAD
Returns the string str, left-padded with the string padstr to a length of len characters.
If str is longer than len, the return value is shortened to len characters.
Analyze Syntax
func.lpad(<str>, <len>, <padstr>)
Analyze Examples
func.lpad('hi',4,'??')
+------------------------+
| func.lpad('hi',4,'??') |
+------------------------+
| ??hi |
+------------------------+
func.lpad('hi',1,'??')
+------------------------+
| func.lpad('hi',1,'??') |
+------------------------+
| h |
+------------------------+
SQL Syntax
LPAD(<str>, <len>, <padstr>)
Arguments
Arguments | Description |
---|
<str> | The string. |
<len> | The length. |
<padstr> | The pad string. |
Return Type
VARCHAR
SQL Examples
SELECT LPAD('hi',4,'??');
+---------------------+
| LPAD('hi', 4, '??') |
+---------------------+
| ??hi |
+---------------------+
SELECT LPAD('hi',1,'??');
+---------------------+
| LPAD('hi', 1, '??') |
+---------------------+
| h |
+---------------------+
16.22 - MID
Alias for SUBSTR.
16.23 - NOT LIKE
Pattern not matching using an SQL pattern. Returns 1 (TRUE) or 0 (FALSE). If either expr or pat is NULL, the result is NULL.
Analyze Syntax
<column>.not_like(<pattern>)
Analyze Examples
my_clothes.not_like('%pants')
+-----------------+
| my_clothes |
+-----------------+
| plaid pants XL |
| plaid hat |
| plaid shirt |
+-----------------+
SQL Syntax
<expr> NOT LIKE <pattern>
SQL Examples
SELECT name, category FROM system.functions WHERE name like 'tou%' AND name not like '%64' ORDER BY name;
+----------+------------+
| name | category |
+----------+------------+
| touint16 | conversion |
| touint32 | conversion |
| touint8 | conversion |
+----------+------------+
16.24 - NOT REGEXP
Returns 1 if the string expr doesn't match the regular expression specified by the pattern pat, 0 otherwise.
Analyze Syntax
not_(<column>.regexp_match(<pattern>))
Analyze Examples
With an input table of:
+-----------------+
| my_clothes |
+-----------------+
| plaid pants |
| plaid hat |
| plaid shirt |
| shoes |
+-----------------+
not_(my_clothes.regexp_match('^p'))
+-------------------------------------+
| not_(my_clothes.regexp_match('^p')) |
+-------------------------------------+
| false |
| false |
| false |
| true |
+-------------------------------------+
SQL Syntax
<expr> NOT REGEXP <pattern>
SQL Examples
SELECT 'databend' NOT REGEXP 'd*';
+------------------------------+
| ('databend' not regexp 'd*') |
+------------------------------+
| 0 |
+------------------------------+
16.25 - NOT RLIKE
Returns 1 if the string expr doesn't match the regular expression specified by the pattern pat, 0 otherwise.
Analyze Syntax
not_(<column>.regexp_match(<pattern>))
Analyze Examples
With an input table of:
+-----------------+
| my_clothes |
+-----------------+
| plaid pants |
| plaid hat |
| plaid shirt |
| shoes |
+-----------------+
not_(my_clothes.regexp_match('^p'))
+-------------------------------------+
| not_(my_clothes.regexp_match('^p')) |
+-------------------------------------+
| false |
| false |
| false |
| true |
+-------------------------------------+
SQL Syntax
<expr> NOT RLIKE <pattern>
SQL Examples
SELECT 'databend' not rlike 'd*';
+-----------------------------+
| ('databend' not rlike 'd*') |
+-----------------------------+
| 0 |
+-----------------------------+
16.26 - OCT
Returns a string representation of the octal value of N.
Analyze Syntax
Analyze Examples
func.oct(12)
+-----------------+
| func.oct(12) |
+-----------------+
| 014 |
+-----------------+
SQL Syntax
SQL Examples
SELECT OCT(12);
+---------+
| OCT(12) |
+---------+
| 014 |
+---------+
16.27 - OCTET_LENGTH
OCTET_LENGTH() is a synonym for LENGTH().
Analyze Syntax
Analyze Examples
func.octet_length('databend')
+-------------------------------+
| func.octet_length('databend') |
+-------------------------------+
| 8 |
+-------------------------------+
SQL Syntax
SQL Examples
SELECT OCTET_LENGTH('databend');
+--------------------------+
| OCTET_LENGTH('databend') |
+--------------------------+
| 8 |
+--------------------------+
16.28 - ORD
If the leftmost character is not a multibyte character, ORD() returns the same value as the ASCII() function.
If the leftmost character of the string str is a multibyte character, returns the code for that character,
calculated from the numeric values of its constituent bytes using this formula:
(1st byte code)
+ (2nd byte code * 256)
+ (3rd byte code * 256^2) ...
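The formula can be applied literally in a few lines of Python. This sketch assumes the string is UTF-8 encoded and takes the byte order exactly as the formula is printed above (first byte weighted by 1); results for multibyte characters should be checked against the engine before relying on them. The helper name is illustrative, and returning 0 for an empty string is an assumption.

```python
def ord_from_bytes(text: str) -> int:
    """Apply the formula above to the leftmost character's UTF-8 bytes."""
    if not text:
        return 0  # assumption: empty input yields 0
    first_char_bytes = text[0].encode("utf-8")
    # byte 1 + byte 2 * 256 + byte 3 * 256^2 + ...
    return sum(byte * 256 ** i for i, byte in enumerate(first_char_bytes))


# A single-byte character reduces to its ASCII code.
print(ord_from_bytes("2"))
```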
Analyze Syntax
Analyze Examples
func.ord('2')
+----------------+
| func.ord('2')  |
+----------------+
| 50 |
+----------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<str> | The string. |
Return Type
BIGINT
SQL Examples
SELECT ORD('2')
+----------+
| ORD('2') |
+----------+
| 50 |
+----------+
16.29 - POSITION
POSITION(substr IN str) is a synonym for LOCATE(substr,str).
Returns the position of the first occurrence of substring substr in string str.
Returns 0 if substr is not in str. Returns NULL if any argument is NULL.
Analyze Syntax
func.position(<substr>, <str>)
Analyze Examples
func.position('bar', 'foobarbar')
+-----------------------------------+
| func.position('bar', 'foobarbar') |
+-----------------------------------+
| 4 |
+-----------------------------------+
SQL Syntax
POSITION(<substr> IN <str>)
Arguments
Arguments | Description |
---|
<substr> | The substring. |
<str> | The string. |
Return Type
BIGINT
SQL Examples
SELECT POSITION('bar' IN 'foobarbar')
+--------------------------------+
| POSITION('bar' IN 'foobarbar') |
+--------------------------------+
| 4 |
+--------------------------------+
SELECT POSITION('xbar' IN 'foobar')
+------------------------------+
| POSITION('xbar' IN 'foobar') |
+------------------------------+
| 0 |
+------------------------------+
16.30 - QUOTE
Quotes a string to produce a result that can be used as a properly escaped data value in an SQL statement.
Analyze Syntax
Analyze Examples
func.quote('Don\'t')
+----------------------+
| func.quote('Don\'t') |
+----------------------+
| Don\'t |
+----------------------+
SQL Syntax
SQL Examples
SELECT QUOTE('Don\'t!');
+-----------------+
| QUOTE('Don't!') |
+-----------------+
| Don\'t! |
+-----------------+
SELECT QUOTE(NULL);
+-------------+
| QUOTE(NULL) |
+-------------+
| NULL |
+-------------+
16.31 - REGEXP
Returns true if the string <expr> matches the regular expression specified by <pattern>, false otherwise.
Analyze Syntax
<column>.regexp_match(<pattern>)
Analyze Examples
With an input table of:
+-----------------+
| my_clothes |
+-----------------+
| plaid pants |
| plaid hat |
| plaid shirt |
| shoes |
+-----------------+
my_clothes.regexp_match('^p')
+-------------------------------+
| my_clothes.regexp_match('^p') |
+-------------------------------+
| true |
| true |
| true |
| false |
+-------------------------------+
SQL Syntax
<expr> REGEXP <pattern>
Aliases
RLIKE
SQL Examples
SELECT 'databend' REGEXP 'd*', 'databend' RLIKE 'd*';
┌────────────────────────────────────────────────────┐
│ ('databend' regexp 'd*') │ ('databend' rlike 'd*') │
├──────────────────────────┼─────────────────────────┤
│ true │ true │
└────────────────────────────────────────────────────┘
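A pattern such as 'd*' accepts zero occurrences of the letter, so it matches any string at all, which is why the example above returns true regardless of the input. The Python sketch below uses the standard `re` module as a stand-in for the engine's regex dialect (an assumption) to contrast that pattern with an anchored one.

```python
import re


def regexp(expr: str, pattern: str) -> bool:
    """Unanchored regex match: true if pattern matches anywhere in expr."""
    return re.search(pattern, expr) is not None


clothes = ["plaid pants", "plaid hat", "plaid shirt", "shoes"]

# 'p*' also matches the empty string, so every row qualifies.
print([regexp(item, "p*") for item in clothes])

# '^p' requires the value to start with the letter p.
print([regexp(item, "^p") for item in clothes])
```

Anchoring the pattern, or using `+` instead of `*`, is usually what is intended when filtering rows.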
16.32 - REGEXP_INSTR
Returns the starting index of the substring of the string expr that matches the regular expression specified by the pattern pat, or 0 if there is no match. If expr or pat is NULL, the return value is NULL. Character indexes begin at 1.
Analyze Syntax
func.regexp_instr(<expr>, <pat[, pos[, occurrence[, return_option[, match_type]]]]>)
Analyze Examples
func.regexp_instr('dog cat dog', 'dog')
+-----------------------------------------+
| func.regexp_instr('dog cat dog', 'dog') |
+-----------------------------------------+
| 1 |
+-----------------------------------------+
SQL Syntax
REGEXP_INSTR(<expr>, <pat[, pos[, occurrence[, return_option[, match_type]]]]>)
Arguments
Arguments | Description |
---|
expr | The string to be matched |
pat | The regular expression |
pos | Optional. The position in expr at which to start the search. If omitted, the default is 1. |
occurrence | Optional. Which occurrence of a match to search for. If omitted, the default is 1. |
return_option | Optional. Which type of position to return. If this value is 0, REGEXP_INSTR() returns the position of the matched substring's first character. If this value is 1, REGEXP_INSTR() returns the position following the matched substring. If omitted, the default is 0. |
match_type | Optional. A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE(). |
Return Type
A number data type value.
SQL Examples
SELECT REGEXP_INSTR('dog cat dog', 'dog');
+------------------------------------+
| REGEXP_INSTR('dog cat dog', 'dog') |
+------------------------------------+
| 1 |
+------------------------------------+
SELECT REGEXP_INSTR('dog cat dog', 'dog', 2);
+---------------------------------------+
| REGEXP_INSTR('dog cat dog', 'dog', 2) |
+---------------------------------------+
| 9 |
+---------------------------------------+
SELECT REGEXP_INSTR('aa aaa aaaa', 'a{2}');
+-------------------------------------+
| REGEXP_INSTR('aa aaa aaaa', 'a{2}') |
+-------------------------------------+
| 1 |
+-------------------------------------+
SELECT REGEXP_INSTR('aa aaa aaaa', 'a{4}');
+-------------------------------------+
| REGEXP_INSTR('aa aaa aaaa', 'a{4}') |
+-------------------------------------+
| 8 |
+-------------------------------------+
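The interaction of pos, occurrence, and return_option can be sketched in Python. This model uses Python's `re` dialect in place of the engine's (an assumption) and leaves match_type out; `None` stands in for NULL.

```python
import re
from typing import Optional


def regexp_instr(expr: Optional[str], pat: Optional[str], pos: int = 1,
                 occurrence: int = 1, return_option: int = 0) -> Optional[int]:
    """1-based index of the requested match, or 0 when there is none."""
    if expr is None or pat is None:
        return None
    # Search starts at the 1-based position pos; indexes stay absolute.
    matches = re.compile(pat).finditer(expr, pos - 1)
    for index, match in enumerate(matches, start=1):
        if index == occurrence:
            # return_option 0: first character of the match;
            # return_option 1: position just after the match.
            end_or_start = match.end() if return_option == 1 else match.start()
            return end_or_start + 1
    return 0


print(regexp_instr("dog cat dog", "dog"))
print(regexp_instr("dog cat dog", "dog", 2))
print(regexp_instr("dog cat dog", "dog", 1, 1, 1))
```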
16.33 - REGEXP_LIKE
The REGEXP_LIKE function checks whether the string matches the regular expression.
Analyze Syntax
func.regexp_like(<expr>, <pat[, match_type]>)
Analyze Examples
func.regexp_like('a', '^[a-d]')
+---------------------------------+
| func.regexp_like('a', '^[a-d]') |
+---------------------------------+
| 1 |
+---------------------------------+
SQL Syntax
REGEXP_LIKE(<expr>, <pat[, match_type]>)
Arguments
Arguments | Description |
---|
<expr> | The string to be matched |
<pat> | The regular expression |
[match_type] | Optional. A string that specifies how to perform matching |
match_type may contain any or all of the following characters:
- c: Case-sensitive matching.
- i: Case-insensitive matching.
- m: Multiple-line mode. Recognize line terminators within the string. The default behavior is to match line terminators only at the start and end of the string expression.
- n: The . character matches line terminators. The default is for . matching to stop at the end of a line.
- u: Unix-only line endings. Not supported yet.
Return Type
BIGINT
Returns 1 if the string expr matches the regular expression specified by the pattern pat, 0 otherwise. If expr or pat is NULL, the return value is NULL.
SQL Examples
SELECT REGEXP_LIKE('a', '^[a-d]');
+----------------------------+
| REGEXP_LIKE('a', '^[a-d]') |
+----------------------------+
| 1 |
+----------------------------+
SELECT REGEXP_LIKE('abc', 'ABC');
+---------------------------+
| REGEXP_LIKE('abc', 'ABC') |
+---------------------------+
| 1 |
+---------------------------+
SELECT REGEXP_LIKE('abc', 'ABC', 'c');
+--------------------------------+
| REGEXP_LIKE('abc', 'ABC', 'c') |
+--------------------------------+
| 0 |
+--------------------------------+
SELECT REGEXP_LIKE('new*\n*line', 'new\\*.\\*line');
+-------------------------------------------+
| REGEXP_LIKE('new*
*line', 'new\*.\*line') |
+-------------------------------------------+
| 0 |
+-------------------------------------------+
SELECT REGEXP_LIKE('new*\n*line', 'new\\*.\\*line', 'n');
+------------------------------------------------+
| REGEXP_LIKE('new*
*line', 'new\*.\*line', 'n') |
+------------------------------------------------+
| 1 |
+------------------------------------------------+
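The match_type characters map naturally onto regex flags. The Python sketch below mirrors the examples above using the standard `re` module; the case-insensitive default is inferred from the 'abc' versus 'ABC' example, letting the last of c and i win is an assumption, and the u option is rejected because it is listed as unsupported.

```python
import re
from typing import Optional


def regexp_like(expr: Optional[str], pat: Optional[str],
                match_type: str = "") -> Optional[int]:
    """Return 1 on match, 0 otherwise; None models NULL."""
    if expr is None or pat is None:
        return None
    ignore_case = True  # assumption inferred from the examples above
    flags = 0
    for option in match_type:
        if option == "c":
            ignore_case = False
        elif option == "i":
            ignore_case = True
        elif option == "m":
            flags |= re.MULTILINE   # ^ and $ match at line terminators
        elif option == "n":
            flags |= re.DOTALL      # . also matches line terminators
        else:
            raise ValueError(f"unsupported match_type option: {option!r}")
    if ignore_case:
        flags |= re.IGNORECASE
    return 1 if re.search(pat, expr, flags) else 0


print(regexp_like("abc", "ABC"))
print(regexp_like("abc", "ABC", "c"))
print(regexp_like("new*\n*line", r"new\*.\*line", "n"))
```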
16.34 - REGEXP_REPLACE
Replaces occurrences in the string expr that match the regular expression specified by the pattern pat with the replacement string repl, and returns the resulting string. If expr, pat, or repl is NULL, the return value is NULL.
Analyze Syntax
func.regexp_replace(<expr>, <pat>, <repl[, pos[, occurrence[, match_type]]]>)
Analyze Examples
func.regexp_replace('a b c', 'b', 'X')
+----------------------------------------+
| func.regexp_replace('a b c', 'b', 'X') |
+----------------------------------------+
| a X c |
+----------------------------------------+
SQL Syntax
REGEXP_REPLACE(<expr>, <pat>, <repl[, pos[, occurrence[, match_type]]]>)
Arguments
Arguments | Description |
---|
expr | The string to be matched |
pat | The regular expression |
repl | The replacement string |
pos | Optional. The position in expr at which to start the search. If omitted, the default is 1. |
occurrence | Optional. Which occurrence of a match to replace. If omitted, the default is 0 (which means "replace all occurrences"). |
match_type | Optional. A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE(). |
Return Type
VARCHAR
SQL Examples
SELECT REGEXP_REPLACE('a b c', 'b', 'X');
+-----------------------------------+
| REGEXP_REPLACE('a b c', 'b', 'X') |
+-----------------------------------+
| a X c |
+-----------------------------------+
SELECT REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3);
+----------------------------------------------------+
| REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3) |
+----------------------------------------------------+
| abc def X |
+----------------------------------------------------+
SELECT REGEXP_REPLACE('周 周周 周周周', '周+', 'X', 3, 2);
+-----------------------------------------------------------+
| REGEXP_REPLACE('周 周周 周周周', '周+', 'X', 3, 2) |
+-----------------------------------------------------------+
| 周 周周 X |
+-----------------------------------------------------------+
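How pos and occurrence select which match gets replaced can be modeled in Python. This sketch relies on Python's `re` dialect (an assumption), inserts repl literally without backreference handling, and omits match_type.

```python
import re
from typing import Optional


def regexp_replace(expr: Optional[str], pat: Optional[str], repl: Optional[str],
                   pos: int = 1, occurrence: int = 0) -> Optional[str]:
    """Replace all matches from pos (occurrence 0) or only the n-th one."""
    if expr is None or pat is None or repl is None:
        return None
    pieces = []
    cursor = 0  # end of the text already copied into pieces
    matches = re.compile(pat).finditer(expr, pos - 1)
    for index, match in enumerate(matches, start=1):
        if occurrence in (0, index):
            pieces.append(expr[cursor:match.start()])
            pieces.append(repl)  # literal text; backreferences not modeled
            cursor = match.end()
            if occurrence != 0:
                break  # only the requested occurrence is replaced
    pieces.append(expr[cursor:])
    return "".join(pieces)


print(regexp_replace("a b c", "b", "X"))
print(regexp_replace("abc def ghi", "[a-z]+", "X", 1, 3))
```

Matches that start before pos are never counted, which is why the third example above replaces the final group.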
16.35 - REGEXP_SUBSTR
Returns the substring of the string expr that matches the regular expression specified by the pattern pat, or NULL if there is no match. If expr or pat is NULL, the return value is NULL.
Analyze Syntax
func.regexp_substr(<expr>, <pat[, pos[, occurrence[, match_type]]]>)
Analyze Examples
func.regexp_substr('abc def ghi', '[a-z]+')
+---------------------------------------------+
| func.regexp_substr('abc def ghi', '[a-z]+') |
+---------------------------------------------+
| abc |
+---------------------------------------------+
SQL Syntax
REGEXP_SUBSTR(<expr>, <pat[, pos[, occurrence[, match_type]]]>)
Arguments
Arguments | Description |
---|
expr | The string to be matched |
pat | The regular expression |
pos | Optional. The position in expr at which to start the search. If omitted, the default is 1. |
occurrence | Optional. Which occurrence of a match to search for. If omitted, the default is 1. |
match_type | Optional. A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE(). |
Return Type
VARCHAR
SQL Examples
SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+');
+----------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+') |
+----------------------------------------+
| abc |
+----------------------------------------+
SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3);
+----------------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3) |
+----------------------------------------------+
| ghi |
+----------------------------------------------+
SELECT REGEXP_SUBSTR('周 周周 周周周 周周周周', '周+', 2, 3);
+------------------------------------------------------------------+
| REGEXP_SUBSTR('周 周周 周周周 周周周周', '周+', 2, 3) |
+------------------------------------------------------------------+
| 周周周周 |
+------------------------------------------------------------------+
16.36 - REPEAT
Returns a string consisting of the string str repeated count times. If count is less than 1, returns an empty string. Returns NULL if str or count is NULL.
Analyze Syntax
func.repeat(<str>, <count>)
Analyze Examples
func.repeat('plaid', 3)
+-------------------------+
| func.repeat('plaid', 3) |
+-------------------------+
| plaidplaidplaid |
+-------------------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<str> | The string. |
<count> | The number. |
SQL Examples
SELECT REPEAT('databend', 3);
+--------------------------+
| REPEAT('databend', 3) |
+--------------------------+
| databenddatabenddatabend |
+--------------------------+
SELECT REPEAT('databend', 0);
+-----------------------+
| REPEAT('databend', 0) |
+-----------------------+
| |
+-----------------------+
SELECT REPEAT('databend', NULL);
+--------------------------+
| REPEAT('databend', NULL) |
+--------------------------+
| NULL |
+--------------------------+
16.37 - REPLACE
Returns the string str with all occurrences of the string from_str replaced by the string to_str.
Analyze Syntax
func.replace(<str>, <from_str>, <to_str>)
Analyze Examples
func.replace('plaidCloud', 'p', 'P')
+--------------------------------------+
| func.replace('plaidCloud', 'p', 'P') |
+--------------------------------------+
| PlaidCloud |
+--------------------------------------+
SQL Syntax
REPLACE(<str>, <from_str>, <to_str>)
Arguments
Arguments | Description |
---|
<str> | The string. |
<from_str> | The from string. |
<to_str> | The to string. |
Return Type
VARCHAR
SQL Examples
SELECT REPLACE('www.mysql.com', 'w', 'Ww');
+-------------------------------------+
| REPLACE('www.mysql.com', 'w', 'Ww') |
+-------------------------------------+
| WwWwWw.mysql.com |
+-------------------------------------+
16.38 - REVERSE
Returns the string str with the order of the characters reversed.
Analyze Syntax
Analyze Examples
func.reverse('abc')
+----------------------+
| func.reverse('abc')  |
+----------------------+
| cba |
+----------------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<str> | The string value. |
Return Type
VARCHAR
SQL Examples
SELECT REVERSE('abc');
+----------------+
| REVERSE('abc') |
+----------------+
| cba |
+----------------+
16.39 - RIGHT
Returns the rightmost len characters from the string str, or NULL if any argument is NULL.
Analyze Syntax
Analyze Examples
func.right('foobarbar', 4)
+----------------------------+
| func.right('foobarbar', 4) |
+----------------------------+
| rbar |
+----------------------------+
SQL Syntax
Arguments
Arguments | Description |
---|
<str> | The main string from where the character to be extracted |
<len> | The count of characters |
Return Type
VARCHAR
SQL Examples
SELECT RIGHT('foobarbar', 4);
+-----------------------+
| RIGHT('foobarbar', 4) |
+-----------------------+
| rbar |
+-----------------------+
16.40 - RLIKE
Alias for REGEXP.
16.41 - RPAD
Returns the string str, right-padded with the string padstr to a length of len characters.
If str is longer than len, the return value is shortened to len characters.
Analyze Syntax
func.rpad(<str>, <len>, <padstr>)
Analyze Examples
func.rpad('hi',5,'?')
+-----------------------+
| func.rpad('hi',5,'?') |
+-----------------------+
| hi??? |
+-----------------------+
func.rpad('hi',1,'?')
+-----------------------+
| func.rpad('hi',1,'?') |
+-----------------------+
| h |
+-----------------------+
SQL Syntax
RPAD(<str>, <len>, <padstr>)
Arguments
Arguments | Description |
---|
<str> | The string. |
<len> | The length. |
<padstr> | The pad string. |
Return Type
VARCHAR
SQL Examples
SELECT RPAD('hi',5,'?');
+--------------------+
| RPAD('hi', 5, '?') |
+--------------------+
| hi??? |
+--------------------+
SELECT RPAD('hi',1,'?');
+--------------------+
| RPAD('hi', 1, '?') |
+--------------------+
| h |
+--------------------+
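Padding and truncation follow the same rule for LPAD and RPAD, differing only in which side receives the fill. A Python sketch of both, assuming a non-negative len and a non-empty padstr (the behavior for an empty pad string is not described above, so the sketch rejects it):

```python
def _fill(padstr: str, count: int) -> str:
    """Repeat padstr and cut it to exactly count characters."""
    if not padstr:
        raise ValueError("padstr must not be empty in this sketch")
    return (padstr * count)[:count]


def lpad(s: str, length: int, padstr: str) -> str:
    if length <= len(s):
        return s[:length]  # longer input is shortened to length
    return _fill(padstr, length - len(s)) + s


def rpad(s: str, length: int, padstr: str) -> str:
    if length <= len(s):
        return s[:length]
    return s + _fill(padstr, length - len(s))


print(lpad("hi", 4, "??"))
print(rpad("hi", 5, "?"))
print(rpad("hi", 1, "?"))
```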
16.42 - SOUNDEX
Generates the Soundex code for a string.
- A Soundex code consists of a letter followed by three numerical digits. PlaidCloud Lakehouse's implementation can return more than four characters, but you can apply SUBSTR to the result to get a standard Soundex code.
- All non-alphabetic characters in the string are ignored.
- All international alphabetic characters outside the A-Z range are ignored unless they're the first letter.
Note: What is Soundex?
Soundex converts an alphanumeric string to a four-character code that is based on how the string sounds when spoken in English. For more information, see https://en.wikipedia.org/wiki/Soundex
See also: SOUNDS LIKE
Analyze Syntax
Analyze Examples
func.soundex('Databend')
+--------------------------+
| func.soundex('Databend') |
+--------------------------+
| D153 |
+--------------------------+
+--------------------------------------+
SQL Syntax
Arguments
Arguments | Description |
---|
str | The string. |
Return Type
Returns a code of type VARCHAR or a NULL value.
SQL Examples
SELECT SOUNDEX('Databend');
---
D153
-- All non-alphabetic characters in the string are ignored.
SELECT SOUNDEX('Databend!');
---
D153
-- All international alphabetic characters outside the A-Z range are ignored unless they're the first letter.
SELECT SOUNDEX('Databend,你好');
---
D153
SELECT SOUNDEX('你好,Databend');
---
你3153
-- SUBSTR the result to get a standard Soundex code.
SELECT SOUNDEX('databend cloud'),SUBSTR(SOUNDEX('databend cloud'),1,4);
soundex('databend cloud')|substring(soundex('databend cloud') from 1 for 4)|
-------------------------+-------------------------------------------------+
D153243 |D153 |
SELECT SOUNDEX(NULL);
+-------------------------------------+
| `SOUNDEX(NULL)` |
+-------------------------------------+
| <null> |
+-------------------------------------+
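The rules above can be sketched in plain Python. This is a minimal illustration, not PlaidCloud Lakehouse's implementation: the name soundex_sketch is hypothetical, and the sketch assumes that skipped letters (vowels, H, W, Y) do not reset the duplicate check, which is the assumption that reproduces the D153243 result shown for 'databend cloud'.

```python
# Hypothetical helper illustrating the Soundex rules described in this section.
# Standard Soundex digit groups for the letters A-Z.
SOUNDEX_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit


def soundex_sketch(text):
    """Return an untruncated Soundex-style code, or None for None input."""
    if text is None:
        return None
    # Non-alphabetic characters are ignored entirely.
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return ""  # assumption: no letters yields an empty code
    first = letters[0].upper()
    result = [first]                 # the first letter is kept as-is
    last = SOUNDEX_CODES.get(first)  # None for vowels and non A-Z letters
    for ch in letters[1:]:
        code = SOUNDEX_CODES.get(ch.upper())
        if code is None:
            continue                 # vowels, H, W, Y, and non A-Z letters
        if code != last:             # assumption: skipped letters do not reset `last`
            result.append(code)
            last = code
    return "".join(result)


full_code = soundex_sketch("databend cloud")
standard_code = full_code[:4]        # same idea as SUBSTR(..., 1, 4)
print(full_code, standard_code)
```

Slicing the first four characters mirrors the SUBSTR call in the SQL example and yields a conventional four-character code.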
16.43 - SOUNDS LIKE
Compares the pronunciation of two strings by their Soundex codes. Soundex is a phonetic algorithm that produces a code representing the pronunciation of a string, allowing for approximate matching of strings based on their pronunciation rather than their spelling. PlaidCloud Lakehouse offers the SOUNDEX function that allows you to get the Soundex code from a string.
SOUNDS LIKE is frequently used in the WHERE clause of SQL queries to narrow down rows using fuzzy string matching, such as for names and addresses. See Filtering Rows in SQL Examples.
Note: While the function can be useful for approximate string matching, it is important to note that it is not always accurate. The Soundex algorithm is based on English pronunciation rules and may not work well for strings from other languages or dialects.
Analyze Syntax
func.sounds_like(<str1>, <str2>)
Analyze Examples
func.sounds_like('Monday', 'Sunday')
+--------------------------------------+
| func.sounds_like('Monday', 'Sunday') |
+--------------------------------------+
| 0 |
+--------------------------------------+
SQL Syntax
<str1> SOUNDS LIKE <str2>
Arguments
Arguments | Description |
---|
str1, str2 | The strings to compare. |
Return Type
Returns a Boolean value: 1 if the Soundex codes for the two strings are the same (which means they sound alike), and 0 otherwise.
SQL Examples
Comparing Strings
SELECT 'two' SOUNDS LIKE 'too';
----
1
SELECT CONCAT('A', 'B') SOUNDS LIKE 'AB';
----
1
SELECT 'Monday' SOUNDS LIKE 'Sunday';
----
0
Filtering Rows
SELECT * FROM employees;
id|first_name|last_name|age|
--+----------+---------+---+
0|John |Smith | 35|
0|Mark |Smythe | 28|
0|Johann |Schmidt | 51|
0|Eric |Doe | 30|
0|Sue |Johnson | 45|
SELECT * FROM employees
WHERE first_name SOUNDS LIKE 'John';
id|first_name|last_name|age|
--+----------+---------+---+
0|John |Smith | 35|
0|Johann |Schmidt | 51|
16.44 - SPACE
Returns a string consisting of N blank space characters.
Analyze Syntax
func.space(<n>)
Analyze Examples
func.space(20)
+-----------------+
| func.space(20) |
+-----------------+
| |
+-----------------+
SQL Syntax
SPACE(<n>)
Arguments
Arguments | Description |
---|
<n> | The number of spaces |
Return Type
String data type value.
SQL Examples
SELECT SPACE(20);
+----------------------+
| SPACE(20) |
+----------------------+
| |
+----------------------+
16.45 - SPLIT
Splits a string using a specified delimiter and returns the resulting parts as an array.
See also: SPLIT_PART
Analyze Syntax
func.split('<input_string>', '<delimiter>')
Analyze Examples
func.split('PlaidCloud Lakehouse', ' ')
+-----------------------------------------+
| func.split('PlaidCloud Lakehouse', ' ') |
+-----------------------------------------+
| ['PlaidCloud','Lakehouse'] |
+-----------------------------------------+
SQL Syntax
SPLIT('<input_string>', '<delimiter>')
Return Type
Array of strings. SPLIT returns NULL when either the input string or the delimiter is NULL.
SQL Examples
-- Use a space as the delimiter
-- SPLIT returns an array with two parts.
SELECT SPLIT('PlaidCloud Lakehouse', ' ');
split('PlaidCloud Lakehouse', ' ')|
----------------------------------+
['PlaidCloud','Lakehouse'] |
-- Use an empty string as the delimiter or a delimiter that does not exist in the input string
-- SPLIT returns an array containing the entire input string as a single part.
SELECT SPLIT('PlaidCloud Lakehouse', '');
split('PlaidCloud Lakehouse', '')|
----------------------------------+
['PlaidCloud Lakehouse'] |
SELECT SPLIT('PlaidCloud Lakehouse', ',');
split('PlaidCloud Lakehouse', ',')|
----------------------------------+
['PlaidCloud Lakehouse'] |
-- Use '\t' (tab) as the delimiter
-- SPLIT returns an array with timestamp, log level, and message.
SELECT SPLIT('2023-10-19 15:30:45\tINFO\tLog message goes here', '\t');
split('2023-10-19 15:30:45\tinfo\tlog message goes here', '\t')|
---------------------------------------------------------------+
['2023-10-19 15:30:45','INFO','Log message goes here'] |
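The behavior shown in these examples can be mirrored with a short Python sketch. The name split_sketch is hypothetical and the code only imitates the documented results; it is not the engine's implementation. Python's own str.split raises ValueError for an empty separator, so that case is handled explicitly.

```python
def split_sketch(input_string, delimiter):
    """Mimic the SPLIT results shown in the examples (illustrative only)."""
    if input_string is None or delimiter is None:
        return None                      # NULL in, NULL out
    if delimiter == "":
        return [input_string]            # empty delimiter: the whole string as one part
    return input_string.split(delimiter)


print(split_sketch("PlaidCloud Lakehouse", " "))
print(split_sketch("PlaidCloud Lakehouse", ","))
print(split_sketch("2023-10-19 15:30:45\tINFO\tLog message goes here", "\t"))
```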
16.46 - SPLIT_PART
Splits a string using a specified delimiter and returns the specified part.
See also: SPLIT
Analyze Syntax
func.split_part('<input_string>', '<delimiter>', '<position>')
Analyze Examples
func.split_part('PlaidCloud Lakehouse', ' ', 1)
+-------------------------------------------------+
| func.split_part('PlaidCloud Lakehouse', ' ', 1) |
+-------------------------------------------------+
| PlaidCloud |
+-------------------------------------------------+
SQL Syntax
SPLIT_PART('<input_string>', '<delimiter>', '<position>')
The position argument specifies which part to return. It uses a 1-based index but can also accept positive, negative, or zero values:
- If position is a positive number, it returns the part at the position from the left to the right, or NULL if it doesn't exist.
- If position is a negative number, it returns the part at the position from the right to the left, or NULL if it doesn't exist.
- If position is 0, it is treated as 1, effectively returning the first part of the string.
Return Type
String. SPLIT_PART returns NULL when either the input string, the delimiter, or the position is NULL.
SQL Examples
-- Use a space as the delimiter
-- SPLIT_PART returns a specific part.
SELECT SPLIT_PART('PlaidCloud Lakehouse', ' ', 1);
split_part('PlaidCloud Lakehouse', ' ', 1)|
------------------------------------------+
PlaidCloud |
-- Use an empty string as the delimiter or a delimiter that does not exist in the input string
-- SPLIT_PART returns the entire input string.
SELECT SPLIT_PART('PlaidCloud Lakehouse', '', 1);
split_part('PlaidCloud Lakehouse', '', 1)|
-----------------------------------+
PlaidCloud Lakehouse |
SELECT SPLIT_PART('PlaidCloud Lakehouse', ',', 1);
split_part('PlaidCloud Lakehouse', ',', 1)|
------------------------------------+
PlaidCloud Lakehouse |
-- Use '\t' (tab) as the delimiter
-- SPLIT_PART returns individual fields.
SELECT SPLIT_PART('2023-10-19 15:30:45\tINFO\tLog message goes here', '\t', 3);
split_part('2023-10-19 15:30:45\tinfo\tlog message goes here', '\t', 3)|
--------------------------------------------------------------------------+
Log message goes here |
-- SPLIT_PART returns an empty string because the specified part does not exist.
SELECT SPLIT_PART('2023-10-19 15:30:45\tINFO\tLog message goes here', '\t', 4);
split_part('2023-10-19 15:30:45\tinfo\tlog message goes here', '\t', 4)|
--------------------------------------------------------------------------+
|
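The position rules can be illustrated with a small Python sketch. The name split_part_sketch is hypothetical, and the code is an approximation rather than the engine's implementation. For a part that does not exist, the sketch returns an empty string, following the last SQL example.

```python
def split_part_sketch(input_string, delimiter, position):
    """Illustrate positive, negative, and zero positions (not the engine's code)."""
    if input_string is None or delimiter is None or position is None:
        return None                                  # NULL in, NULL out
    # An empty delimiter leaves the input as a single part.
    parts = [input_string] if delimiter == "" else input_string.split(delimiter)
    if position == 0:
        position = 1                                 # 0 is treated as 1
    # Positive positions count from the left, negative ones from the right.
    index = position - 1 if position > 0 else len(parts) + position
    if 0 <= index < len(parts):
        return parts[index]
    return ""                                        # assumption: follows the last example


print(split_part_sketch("PlaidCloud Lakehouse", " ", 1))
print(split_part_sketch("PlaidCloud Lakehouse", " ", -1))
```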
16.47 - STRCMP
Returns 0 if the strings are the same, -1 if the first argument is smaller than the second, and 1 otherwise.
Analyze Syntax
func.strcmp(<expr1>, <expr2>)
Analyze Examples
func.strcmp('text', 'text2')
+------------------------------+
| func.strcmp('text', 'text2') |
+------------------------------+
| -1 |
+------------------------------+
func.strcmp('text2', 'text')
+------------------------------+
| func.strcmp('text2', 'text') |
+------------------------------+
| 1 |
+------------------------------+
func.strcmp('text', 'text')
+------------------------------+
| func.strcmp('text', 'text') |
+------------------------------+
| 0 |
+------------------------------+
SQL Syntax
STRCMP(<expr1>, <expr2>)
Arguments
Arguments | Description |
---|
<expr1> | The first string to compare. |
<expr2> | The second string to compare. |
Return Type
BIGINT
SQL Examples
SELECT STRCMP('text', 'text2');
+-------------------------+
| STRCMP('text', 'text2') |
+-------------------------+
| -1 |
+-------------------------+
SELECT STRCMP('text2', 'text');
+-------------------------+
| STRCMP('text2', 'text') |
+-------------------------+
| 1 |
+-------------------------+
SELECT STRCMP('text', 'text');
+------------------------+
| STRCMP('text', 'text') |
+------------------------+
| 0 |
+------------------------+
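The three-way result can be sketched in Python as follows. The name strcmp_sketch is hypothetical, and the sketch assumes a plain code-point ordering of strings, which may differ from the collation the engine applies.

```python
def strcmp_sketch(expr1, expr2):
    """Return -1, 0, or 1 from a three-way string comparison (illustrative)."""
    # Booleans subtract as integers: True - False == 1, False - True == -1.
    return (expr1 > expr2) - (expr1 < expr2)


print(strcmp_sketch("text", "text2"))
print(strcmp_sketch("text2", "text"))
print(strcmp_sketch("text", "text"))
```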
16.48 - SUBSTR
Extracts a string containing a specific number of characters from a particular position of a given string.
- The forms without a len argument return a substring from string str starting at position pos.
- The forms with a len argument return a substring len characters long from string str, starting at position pos.
It is also possible to use a negative value for pos. In this case, the beginning of the substring is pos characters from the end of the string, rather than the beginning. A negative value may be used for pos in any of the forms of this function. A value of 0 for pos returns an empty string. The position of the first character in the string from which the substring is to be extracted is reckoned as 1.
Analyze Syntax
func.substr(<str>, <pos>, <len>)
Analyze Examples
func.substr('Quadratically', 5, 6)
+------------------------------------+
| func.substr('Quadratically', 5, 6) |
+------------------------------------+
| ratica |
+------------------------------------+
SQL Syntax
SUBSTR(<str>, <pos>)
SUBSTR(<str>, <pos>, <len>)
Arguments
Arguments | Description |
---|
<str> | The source string from which the substring is extracted |
<pos> | The position (starting from 1) at which the substring starts. If negative, the position is counted from the end of the string |
<len> | The maximum length of the substring to extract |
Aliases
- MID
- SUBSTRING
Return Type
VARCHAR
SQL Examples
SELECT
SUBSTRING('Quadratically', 5),
SUBSTR('Quadratically', 5),
MID('Quadratically', 5);
┌─────────────────────────────────────────────────────────────────────────────────────────────────┐
│ substring('quadratically' from 5) │ substring('quadratically' from 5) │ mid('quadratically', 5) │
├───────────────────────────────────┼───────────────────────────────────┼─────────────────────────┤
│ ratically │ ratically │ ratically │
└─────────────────────────────────────────────────────────────────────────────────────────────────┘
SELECT
SUBSTRING('Quadratically', 5, 6),
SUBSTR('Quadratically', 5, 6),
MID('Quadratically', 5, 6);
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ substring('quadratically' from 5 for 6) │ substring('quadratically' from 5 for 6) │ mid('quadratically', 5, 6) │
├─────────────────────────────────────────┼─────────────────────────────────────────┼────────────────────────────┤
│ ratica │ ratica │ ratica │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
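The position rules for pos (1-based, negative counts from the end, 0 yields an empty string) can be sketched in Python. The name substr_sketch is hypothetical. How out-of-range positions and non-positive lengths behave is an assumption made for this sketch, not something this page states.

```python
def substr_sketch(text, pos, length=None):
    """Illustrate 1-based and negative positions (not the engine's code)."""
    if pos == 0:
        return ""                        # documented: pos 0 returns an empty string
    if abs(pos) > len(text):
        return ""                        # assumption: out-of-range start yields ''
    # Convert the 1-based (or negative) position to a 0-based index.
    start = pos - 1 if pos > 0 else len(text) + pos
    if length is None:
        return text[start:]              # form without len: rest of the string
    if length <= 0:
        return ""                        # assumption: non-positive len yields ''
    return text[start:start + length]


print(substr_sketch("Quadratically", 5))
print(substr_sketch("Quadratically", 5, 6))
print(substr_sketch("Quadratically", -5))
```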
16.49 - SUBSTRING
Alias for SUBSTR.
16.50 - TO_BASE64
Converts the string argument to base-64 encoded form and returns the result as a character string.
If the argument is not a string, it is converted to a string before conversion takes place.
The result is NULL if the argument is NULL.
Analyze Syntax
func.to_base64(<v>)
Analyze Examples
func.to_base64('abc')
+-----------------------+
| func.to_base64('abc') |
+-----------------------+
| YWJj |
+-----------------------+
SQL Syntax
TO_BASE64(<v>)
Arguments
Arguments | Description |
---|
<v> | The value. |
Return Type
VARCHAR
SQL Examples
SELECT TO_BASE64('abc');
+------------------+
| TO_BASE64('abc') |
+------------------+
| YWJj |
+------------------+
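For comparison, the same encoding can be reproduced with Python's standard base64 module. The helper name to_base64_sketch is hypothetical, and the sketch assumes the string is encoded as UTF-8 before conversion.

```python
import base64


def to_base64_sketch(value):
    """Encode a value as base-64 text, returning None for None (illustrative)."""
    if value is None:
        return None
    # Non-string arguments are converted to a string first.
    text = value if isinstance(value, str) else str(value)
    # Assumption: the string is encoded as UTF-8 bytes.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


print(to_base64_sketch("abc"))
```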
16.51 - TRANSLATE
Transforms a given string by replacing specific characters with corresponding replacements, as defined by the provided mapping.
Analyze Syntax
func.translate('<inputString>', '<charactersToReplace>', '<replacementCharacters>')
Analyze Examples
func.translate('databend', 'de', 'DE')
+----------------------------------------+
| func.translate('databend', 'de', 'DE') |
+----------------------------------------+
| DatabEnD |
+----------------------------------------+
SQL Syntax
TRANSLATE('<inputString>', '<charactersToReplace>', '<replacementCharacters>')
Parameter | Description |
---|
<inputString> | The input string to be transformed. |
<charactersToReplace> | The string containing characters to be replaced in the input string. |
<replacementCharacters> | The string containing replacement characters corresponding to those in <charactersToReplace> . |
SQL Examples
-- Replace 'd' with '$' in 'databend'
SELECT TRANSLATE('databend', 'd', '$');
---
$ataben$
-- Replace 'd' with 'D' in 'databend'
SELECT TRANSLATE('databend', 'd', 'D');
---
DatabenD
-- Replace 'd' with 'D' and 'e' with 'E' in 'databend'
SELECT TRANSLATE('databend', 'de', 'DE');
---
DatabEnD
-- Remove 'd' from 'databend'
SELECT TRANSLATE('databend', 'd', '');
---
ataben
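The character-by-character mapping can be sketched with Python's built-in str.translate. The name translate_sketch is hypothetical. The sketch assumes that a character with no counterpart in the replacement string is removed, which matches the last example above.

```python
def translate_sketch(input_string, characters_to_replace, replacement_characters):
    """Map each character to its positional replacement (illustrative only)."""
    table = {}
    for index, ch in enumerate(characters_to_replace):
        if ord(ch) in table:
            continue                     # assumption: the first mapping for a character wins
        if index < len(replacement_characters):
            table[ord(ch)] = replacement_characters[index]
        else:
            table[ord(ch)] = None        # no counterpart: the character is removed
    return input_string.translate(table)


print(translate_sketch("databend", "de", "DE"))
print(translate_sketch("databend", "d", ""))
```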
16.52 - TRIM
Returns the string without leading or trailing occurrences of the specified remove string. If the remove string is omitted, spaces are removed.
The Analyze function automatically trims both leading and trailing spaces.
Analyze Syntax
func.trim(<str>)
Analyze Examples
func.trim(' plaidcloud ')
+--------------------------------+
| func.trim(' plaidcloud ') |
+--------------------------------+
| 'plaidcloud' |
+--------------------------------+
SQL Syntax
TRIM([{BOTH | LEADING | TRAILING} [remstr] FROM ] str)
SQL Examples
Please note that ALL the examples in this section will return the string 'databend'.
The following example removes the leading and trailing string 'xxx' from the string 'xxxdatabendxxx':
SELECT TRIM(BOTH 'xxx' FROM 'xxxdatabendxxx');
The following example removes the leading string 'xxx' from the string 'xxxdatabend':
SELECT TRIM(LEADING 'xxx' FROM 'xxxdatabend' );
The following example removes the trailing string 'xxx' from the string 'databendxxx':
SELECT TRIM(TRAILING 'xxx' FROM 'databendxxx' );
If no remove string is specified, the function removes all leading and trailing spaces. The following examples remove the leading and/or trailing spaces:
SELECT TRIM(' databend ');
SELECT TRIM(' databend');
SELECT TRIM('databend ');
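The three trimming modes can be sketched in Python. The name trim_sketch and its mode parameter are hypothetical, and the code is not the engine's implementation. Unlike Python's str.strip, which treats its argument as a set of characters, the sketch removes whole occurrences of the remove string.

```python
def trim_sketch(text, remstr=" ", mode="BOTH"):
    """Remove whole occurrences of remstr from the chosen side(s) (illustrative)."""
    if not remstr:
        return text                      # nothing to remove
    if mode in ("BOTH", "LEADING"):
        while text.startswith(remstr):
            text = text[len(remstr):]
    if mode in ("BOTH", "TRAILING"):
        while text.endswith(remstr):
            text = text[:-len(remstr)]
    return text


print(trim_sketch("xxxdatabendxxx", "xxx", "BOTH"))
print(trim_sketch("   databend   "))
```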
16.53 - UCASE
Alias for UPPER.
16.54 - UNHEX
For a string argument str, UNHEX(str) interprets each pair of characters in the argument as a hexadecimal number and converts it to the byte represented by the number. The return value is a binary string.
Analyze Syntax
func.unhex(<str>)
Analyze Examples
func.unhex('6461746162656e64')
+--------------------------------+
| func.unhex('6461746162656e64') |
+--------------------------------+
| 6461746162656E64 |
+--------------------------------+
SQL Syntax
UNHEX(<str>)
Aliases
- FROM_HEX
SQL Examples
SELECT UNHEX('6461746162656e64') as c1, typeof(c1),UNHEX('6461746162656e64')::varchar as c2, typeof(c2), FROM_HEX('6461746162656e64');
┌──────────────────────────────────────────────────────────────────────────────────────┐
│ c1               │ typeof(c1) │ c2       │ typeof(c2) │ from_hex('6461746162656e64') │
├──────────────────┼────────────┼──────────┼────────────┼──────────────────────────────┤
│ 6461746162656E64 │ binary     │ databend │ varchar    │ 6461746162656E64             │
└──────────────────────────────────────────────────────────────────────────────────────┘
SELECT UNHEX(HEX('string')), unhex(HEX('string'))::varchar;
┌──────────────────────────────────────────────────────┐
│ unhex(hex('string')) │ unhex(hex('string'))::varchar │
├──────────────────────┼───────────────────────────────┤
│ 737472696E67 │ string │
└──────────────────────────────────────────────────────┘
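The pairwise hex decoding can be reproduced with Python's built-in bytes.fromhex, shown here only as an analogy to the SQL results above. Decoding the bytes as UTF-8 is an assumption that corresponds to the ::varchar cast in the examples.

```python
# Each pair of hex digits becomes one byte, as UNHEX does.
raw = bytes.fromhex("6461746162656e64")

# Binary values are displayed as uppercase hex in the result tables above.
as_hex = raw.hex().upper()

# Casting to VARCHAR corresponds to decoding the bytes as text (UTF-8 assumed).
as_text = raw.decode("utf-8")

print(as_hex, as_text)
```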
16.55 - UPPER
Returns a string with all characters changed to uppercase.
Analyze Syntax
func.upper(<str>)
Analyze Examples
func.upper('hello, plaidcloud lakehouse!')
+--------------------------------------------+
| func.upper('hello, plaidcloud lakehouse!') |
+--------------------------------------------+
| 'HELLO, PLAIDCLOUD LAKEHOUSE!' |
+--------------------------------------------+
SQL Syntax
UPPER(<str>)
Aliases
- UCASE
Return Type
VARCHAR
SQL Examples
SELECT UPPER('hello, databend!'), UCASE('hello, databend!');
┌───────────────────────────────────────────────────────┐
│ upper('hello, databend!') │ ucase('hello, databend!') │
├───────────────────────────┼───────────────────────────┤
│ HELLO, DATABEND! │ HELLO, DATABEND! │
└───────────────────────────────────────────────────────┘
17 - System Functions
This section provides reference information for the system-related functions in PlaidCloud Lakehouse.
List of Functions:
17.1 - CLUSTERING_INFORMATION
Returns clustering information of a table.
SQL Syntax
CLUSTERING_INFORMATION('<database_name>', '<table_name>')
SQL Examples
CREATE TABLE mytable(a int, b int) CLUSTER BY(a+1);
INSERT INTO mytable VALUES(1,1),(3,3);
INSERT INTO mytable VALUES(2,2),(5,5);
INSERT INTO mytable VALUES(4,4);
SELECT * FROM CLUSTERING_INFORMATION('default','mytable')\G
*************************** 1. row ***************************
cluster_key: ((a + 1))
total_block_count: 3
constant_block_count: 1
unclustered_block_count: 0
average_overlaps: 1.3333
average_depth: 2.0
block_depth_histogram: {"00002":3}
Parameter | Description |
---|
cluster_key | The defined cluster key. |
total_block_count | The current count of blocks. |
constant_block_count | The count of blocks where min/max values are equal, meaning each block contains only one (group of) cluster_key value. |
unclustered_block_count | The count of blocks that have not yet been clustered. |
average_overlaps | The average ratio of overlapping blocks within a given range. |
average_depth | The average depth of overlapping partitions for the cluster key. |
block_depth_histogram | The number of partitions at each depth level. A higher concentration of partitions at lower depths indicates more effective table clustering. |
17.2 - FUSE_BLOCK
Returns the block information of the latest or specified snapshot of a table. For more information about blocks in PlaidCloud Lakehouse, see What are Snapshot, Segment, and Block?
The command returns the location information of each parquet file referenced by a snapshot. This enables downstream applications to access and consume the data stored in the files.
SQL Syntax
FUSE_BLOCK('<database_name>', '<table_name>'[, '<snapshot_id>'])
SQL Examples
CREATE TABLE mytable(c int);
INSERT INTO mytable values(1);
INSERT INTO mytable values(2);
SELECT * FROM FUSE_BLOCK('default', 'mytable');
---
+----------------------------------+----------------------------+----------------------------------------------------+------------+----------------------------------------------------+-------------------+
| snapshot_id | timestamp | block_location | block_size | bloom_filter_location | bloom_filter_size |
+----------------------------------+----------------------------+----------------------------------------------------+------------+----------------------------------------------------+-------------------+
| 51e84b56458f44269b05a059b364a659 | 2022-09-15 07:14:14.137268 | 1/7/_b/39a6dbbfd9b44ad5a8ec8ab264c93cf5_v0.parquet | 4 | 1/7/_i/39a6dbbfd9b44ad5a8ec8ab264c93cf5_v1.parquet | 221 |
| 51e84b56458f44269b05a059b364a659 | 2022-09-15 07:14:14.137268 | 1/7/_b/d0ee9688c4d24d6da86acd8b0d6f4fad_v0.parquet | 4 | 1/7/_i/d0ee9688c4d24d6da86acd8b0d6f4fad_v1.parquet | 219 |
+----------------------------------+----------------------------+----------------------------------------------------+------------+----------------------------------------------------+-------------------+
17.3 - FUSE_COLUMN
Returns the column information of the latest or specified snapshot of a table. For more information about blocks in PlaidCloud Lakehouse, see What are Snapshot, Segment, and Block?
SQL Syntax
FUSE_COLUMN('<database_name>', '<table_name>'[, '<snapshot_id>'])
SQL Examples
CREATE TABLE mytable(c int);
INSERT INTO mytable values(1);
INSERT INTO mytable values(2);
SELECT * FROM FUSE_COLUMN('default', 'mytable');
---
+----------------------------------+----------------------------+---------------------------------------------------------+------------+-----------+-----------+-------------+-------------+-----------+--------------+------------------+
| snapshot_id | timestamp | block_location | block_size | file_size | row_count | column_name | column_type | column_id | block_offset | bytes_compressed |
+----------------------------------+----------------------------+---------------------------------------------------------+------------+-----------+-----------+-------------+-------------+-----------+--------------+------------------+
| 3faefc1a9b6a48f388a8b59228dd06c1 | 2023-07-18 03:06:30.276502 | 1/118746/_b/44df130c207745cb858928135d39c1c0_v2.parquet | 4 | 196 | 1 | c | Int32 | 0 | 8 | 14 |
| 3faefc1a9b6a48f388a8b59228dd06c1 | 2023-07-18 03:06:30.276502 | 1/118746/_b/b6f8496d7e3f4f62a89c09572840cf70_v2.parquet | 4 | 196 | 1 | c | Int32 | 0 | 8 | 14 |
+----------------------------------+----------------------------+---------------------------------------------------------+------------+-----------+-----------+-------------+-------------+-----------+--------------+------------------+
17.4 - FUSE_ENCODING
Returns the encoding types applied to a specific column within a table. It helps you understand how data is compressed and stored in a native format within the table.
SQL Syntax
FUSE_ENCODING('<database_name>', '<table_name>', '<column_name>')
The function returns a result set with the following columns:
Column | Data Type | Description |
---|
VALIDITY_SIZE | Nullable(UInt32) | The size of a bitmap value that indicates whether each row in the column has a non-null value. This bitmap is used to track the presence or absence of null values in the column's data. |
COMPRESSED_SIZE | UInt32 | The size of the column data after compression. |
UNCOMPRESSED_SIZE | UInt32 | The size of the column data before applying encoding. |
LEVEL_ONE | String | The primary or initial encoding applied to the column. |
LEVEL_TWO | Nullable(String) | A secondary or recursive encoding method applied to the column after the initial encoding. |
SQL Examples
-- Create a table with an integer column 'c' and apply 'Lz4' compression
CREATE TABLE t(c INT) STORAGE_FORMAT = 'native' COMPRESSION = 'lz4';
-- Insert data into the table.
INSERT INTO t SELECT number FROM numbers(2048);
-- Analyze the encoding for column 'c' in table 't'
SELECT LEVEL_ONE, LEVEL_TWO, COUNT(*)
FROM FUSE_ENCODING('default', 't', 'c')
GROUP BY LEVEL_ONE, LEVEL_TWO;
level_one |level_two|count(*)|
------------+---------+--------+
DeltaBitpack| | 1|
-- Insert 2,048 rows with the value 1 into the table 't'
INSERT INTO t (c)
SELECT 1
FROM numbers(2048);
SELECT LEVEL_ONE, LEVEL_TWO, COUNT(*)
FROM FUSE_ENCODING('default', 't', 'c')
GROUP BY LEVEL_ONE, LEVEL_TWO;
level_one |level_two|count(*)|
------------+---------+--------+
OneValue | | 1|
DeltaBitpack| | 1|
17.5 - FUSE_SEGMENT
Returns the segment information of a specified table snapshot. For more information about segments in PlaidCloud Lakehouse, see What are Snapshot, Segment, and Block?
SQL Syntax
FUSE_SEGMENT('<database_name>', '<table_name>','<snapshot_id>')
SQL Examples
CREATE TABLE mytable(c int);
INSERT INTO mytable values(1);
INSERT INTO mytable values(2);
-- Obtain a snapshot ID
SELECT snapshot_id FROM FUSE_SNAPSHOT('default', 'mytable') limit 1;
---
+----------------------------------+
| snapshot_id |
+----------------------------------+
| 82c572947efa476892bd7c0635158ba2 |
+----------------------------------+
SELECT * FROM FUSE_SEGMENT('default', 'mytable', '82c572947efa476892bd7c0635158ba2');
---
+----------------------------------------------------+----------------+-------------+-----------+--------------------+------------------+
| file_location | format_version | block_count | row_count | bytes_uncompressed | bytes_compressed |
+----------------------------------------------------+----------------+-------------+-----------+--------------------+------------------+
| 1/319/_sg/d35fe7bf99584301b22e8f6a8a9c97f9_v1.json | 1 | 1 | 1 | 4 | 184 |
| 1/319/_sg/c261059d47c840e1b749222dabb4b2bb_v1.json | 1 | 1 | 1 | 4 | 184 |
+----------------------------------------------------+----------------+-------------+-----------+--------------------+------------------+
17.6 - FUSE_SNAPSHOT
Returns the snapshot information of a table. For more information about snapshots in PlaidCloud Lakehouse, see What are Snapshot, Segment, and Block?
SQL Syntax
FUSE_SNAPSHOT('<database_name>', '<table_name>')
SQL Examples
CREATE TABLE mytable(a int, b int) CLUSTER BY(a+1);
INSERT INTO mytable VALUES(1,1),(3,3);
INSERT INTO mytable VALUES(2,2),(5,5);
INSERT INTO mytable VALUES(4,4);
SELECT * FROM FUSE_SNAPSHOT('default','mytable');
---
| snapshot_id | snapshot_location | format_version | previous_snapshot_id | segment_count | block_count | row_count | bytes_uncompressed | bytes_compressed | index_size | timestamp |
|----------------------------------|------------------------------------------------------------|----------------|----------------------------------|---------------|-------------|-----------|--------------------|------------------|------------|----------------------------|
| a13d211b7421432898a3786848b8ced3 | 670655/783287/_ss/a13d211b7421432898a3786848b8ced3_v1.json | 1 | \N | 1 | 1 | 2 | 16 | 290 | 363 | 2022-09-19 14:51:52.860425 |
| cf08e6af6c134642aeb76bc81e6e7580 | 670655/783287/_ss/cf08e6af6c134642aeb76bc81e6e7580_v1.json | 1 | a13d211b7421432898a3786848b8ced3 | 2 | 2 | 4 | 32 | 580 | 726 | 2022-09-19 14:52:15.282943 |
| 1bd4f68b831a402e8c42084476461aa1 | 670655/783287/_ss/1bd4f68b831a402e8c42084476461aa1_v1.json | 1 | cf08e6af6c134642aeb76bc81e6e7580 | 3 | 3 | 5 | 40 | 862 | 1085 | 2022-09-19 14:52:20.284347 |
17.7 - FUSE_STATISTIC
Returns the estimated number of distinct values of each column in a table.
SQL Syntax
FUSE_STATISTIC('<database_name>', '<table_name>')
SQL Examples
You're most likely to use this function together with ANALYZE TABLE <table_name> to generate and check the statistical information of a table. For more explanations and examples, see OPTIMIZE TABLE.
18 - Table Functions
This section provides reference information for the table-related functions in PlaidCloud Lakehouse.
18.1 - GENERATE_SERIES
Generates a dataset starting from a specified point, ending at another specified point, and optionally with an incrementing value. The GENERATE_SERIES function works with integer, date, and timestamp values.
Analyze Syntax
func.generate_series(<start>, <stop>[, <step_interval>])
Analyze Examples
func.generate_series(1, 10, 2);
generate_series|
---------------+
1|
3|
5|
7|
9|
SQL Syntax
GENERATE_SERIES(<start>, <stop>[, <step_interval>])
Arguments
Argument | Description |
---|
start | The starting value, representing the first number, date, or timestamp in the sequence. |
stop | The ending value, representing the last number, date, or timestamp in the sequence. |
step_interval | The step interval, determining the difference between adjacent values in the sequence. For integer sequences, the default value is 1. For date sequences, the default step interval is 1 day. For timestamp sequences, the default step interval is 1 microsecond. |
Note: GENERATE_SERIES and RANGE differ in how they treat boundaries. GENERATE_SERIES includes both the start and stop values, while RANGE includes the start value but excludes the stop value. For example, RANGE(1, 11) is equivalent to GENERATE_SERIES(1, 10).
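The boundary difference in the note can be illustrated with a short Python sketch, since Python's built-in range also excludes its right endpoint. The name generate_series_sketch is hypothetical, and the sketch covers positive steps only.

```python
from datetime import date, timedelta


def generate_series_sketch(start, stop, step=1):
    """Return values from start to stop, inclusive on both ends (illustrative)."""
    if start + step <= start:
        raise ValueError("this sketch only supports positive steps")
    values = []
    current = start
    while current <= stop:               # the stop value itself is included
        values.append(current)
        current = current + step
    return values


# Integers: the inclusive series 1..10 matches Python's half-open range(1, 11).
print(generate_series_sketch(1, 10) == list(range(1, 11)))
print(generate_series_sketch(1, 10, 2))

# Dates: a one-day step, matching the default step for date sequences.
days = generate_series_sketch(date(2023, 3, 20), date(2023, 3, 27), timedelta(days=1))
print(len(days))
```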
Return Type
Returns a list containing a continuous sequence of numeric values, dates, or timestamps from start to stop.
SQL Examples
SQL Examples 1: Generating Numeric, Date, and Timestamp Data
SELECT * FROM GENERATE_SERIES(1, 10, 2);
generate_series|
---------------+
1|
3|
5|
7|
9|
SELECT * FROM GENERATE_SERIES('2023-03-20'::date, '2023-03-27'::date);
generate_series|
---------------+
2023-03-20|
2023-03-21|
2023-03-22|
2023-03-23|
2023-03-24|
2023-03-25|
2023-03-26|
2023-03-27|
SELECT * FROM GENERATE_SERIES('2023-03-26 00:00'::timestamp, '2023-03-27 12:00'::timestamp, 86400000000);
generate_series |
-------------------+
2023-03-26 00:00:00|
2023-03-27 00:00:00|
SQL Examples 2: Filling Query Result Gaps
This example uses the GENERATE_SERIES function and left join operator to handle gaps in query results caused by missing information in specific ranges.
CREATE TABLE t_metrics (
date Date,
value INT
);
INSERT INTO t_metrics VALUES
('2020-01-01', 200),
('2020-01-01', 300),
('2020-01-04', 300),
('2020-01-04', 300),
('2020-01-05', 400),
('2020-01-10', 700);
SELECT date, SUM(value), COUNT() FROM t_metrics GROUP BY date ORDER BY date;
date |sum(value)|count()|
----------+----------+-------+
2020-01-01| 500| 2|
2020-01-04| 600| 2|
2020-01-05| 400| 1|
2020-01-10| 700| 1|
To close the gaps between January 1st and January 10th, 2020, use the following query:
SELECT t.date, COALESCE(SUM(t_metrics.value), 0), COUNT(t_metrics.value)
FROM generate_series(
'2020-01-01'::Date,
'2020-01-10'::Date
) AS t(date)
LEFT JOIN t_metrics ON t_metrics.date = t.date
GROUP BY t.date ORDER BY t.date;
date |coalesce(sum(t_metrics.value), 0)|count(t_metrics.value)|
----------+---------------------------------+----------------------+
2020-01-01| 500| 2|
2020-01-02| 0| 0|
2020-01-03| 0| 0|
2020-01-04| 600| 2|
2020-01-05| 400| 1|
2020-01-06| 0| 0|
2020-01-07| 0| 0|
2020-01-08| 0| 0|
2020-01-09| 0| 0|
2020-01-10| 700| 1|
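The same gap-filling idea can be sketched in Python: build the full list of dates, then look up each date's aggregate and fall back to zero when it is missing. The names rows and fill_gaps are hypothetical. The sketch only mirrors the logic of the LEFT JOIN query and does not describe how the engine executes it.

```python
from datetime import date, timedelta

# The same sample data as t_metrics above.
rows = [
    (date(2020, 1, 1), 200), (date(2020, 1, 1), 300),
    (date(2020, 1, 4), 300), (date(2020, 1, 4), 300),
    (date(2020, 1, 5), 400), (date(2020, 1, 10), 700),
]


def fill_gaps(rows, start, stop):
    """Return (date, sum, count) for every day from start to stop inclusive."""
    totals = {}
    for day, value in rows:
        running_sum, running_count = totals.get(day, (0, 0))
        totals[day] = (running_sum + value, running_count + 1)
    result = []
    current = start
    while current <= stop:
        # Days without data fall back to (0, 0), like COALESCE(SUM(...), 0).
        day_sum, day_count = totals.get(current, (0, 0))
        result.append((current, day_sum, day_count))
        current += timedelta(days=1)
    return result


for day, day_sum, day_count in fill_gaps(rows, date(2020, 1, 1), date(2020, 1, 10)):
    print(day, day_sum, day_count)
```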
18.2 - INFER_SCHEMA
Automatically detects the file metadata schema and retrieves the column definitions.
Caution: infer_schema currently only supports the Parquet file format.
SQL Syntax
INFER_SCHEMA(
LOCATION => '{ internalStage | externalStage }'
[ PATTERN => '<regex_pattern>']
)
Where:
internalStage
internalStage ::= @<internal_stage_name>[/<path>]
externalStage
externalStage ::= @<external_stage_name>[/<path>]
PATTERN => '<regex_pattern>'
A PCRE2-based regular expression pattern string, enclosed in single quotes, specifying the file names to match. For PCRE2 syntax, see http://www.pcre.org/current/doc/html/pcre2syntax.html.
SQL Examples
Generate a parquet file in a stage:
CREATE STAGE infer_parquet FILE_FORMAT = (TYPE = PARQUET);
COPY INTO @infer_parquet FROM (SELECT * FROM numbers(10)) FILE_FORMAT = (TYPE = PARQUET);
LIST @infer_parquet;
+-------------------------------------------------------+------+------------------------------------+-------------------------------+---------+
| name | size | md5 | last_modified | creator |
+-------------------------------------------------------+------+------------------------------------+-------------------------------+---------+
| data_e0fd9cba-f45c-4c43-aa07-d6d87d134378_0_0.parquet | 258 | "7DCC9FFE04EA1F6882AED2CF9640D3D4" | 2023-02-09 05:21:52.000 +0000 | NULL |
+-------------------------------------------------------+------+------------------------------------+-------------------------------+---------+
infer_schema
SELECT * FROM INFER_SCHEMA(location => '@infer_parquet/data_e0fd9cba-f45c-4c43-aa07-d6d87d134378_0_0.parquet');
+-------------+-----------------+----------+----------+
| column_name | type | nullable | order_id |
+-------------+-----------------+----------+----------+
| number | BIGINT UNSIGNED | 0 | 0 |
+-------------+-----------------+----------+----------+
infer_schema with Pattern Matching
SELECT * FROM infer_schema(location => '@infer_parquet/', pattern => '.*parquet');
+-------------+-----------------+----------+----------+
| column_name | type | nullable | order_id |
+-------------+-----------------+----------+----------+
| number | BIGINT UNSIGNED | 0 | 0 |
+-------------+-----------------+----------+----------+
Create a Table From Parquet File
The infer_schema function can only display the schema of a parquet file and cannot create a table from it.
To create a table from a parquet file:
CREATE TABLE mytable AS SELECT * FROM @infer_parquet/ (pattern=>'.*parquet') LIMIT 0;
DESC mytable;
+--------+-----------------+------+---------+-------+
| Field | Type | Null | Default | Extra |
+--------+-----------------+------+---------+-------+
| number | BIGINT UNSIGNED | NO | 0 | |
+--------+-----------------+------+---------+-------+
18.3 - INSPECT_PARQUET
Retrieves a table of comprehensive metadata from a staged Parquet file, including the following columns:
Column | Description |
---|
created_by | The entity or source responsible for creating the Parquet file |
num_columns | The number of columns in the Parquet file |
num_rows | The total number of rows or records in the Parquet file |
num_row_groups | The count of row groups within the Parquet file |
serialized_size | The size of the Parquet file on disk (compressed) |
max_row_groups_size_compressed | The size of the largest row group (compressed) |
max_row_groups_size_uncompressed | The size of the largest row group (uncompressed) |
SQL Syntax
INSPECT_PARQUET('@<path-to-file>')
SQL Examples
This example retrieves the metadata from a staged sample Parquet file named books.parquet. The file contains two records:
Transaction Processing,Jim Gray,1992
Readings in Database Systems,Michael Stonebraker,2004
-- Show the staged file
LIST @my_internal_stage;
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ name │ size │ md5 │ last_modified │ creator │
├───────────────┼────────┼──────────────────┼───────────────────────────────┼──────────────────┤
│ books.parquet │ 998 │ NULL │ 2023-04-19 19:34:51.303 +0000 │ NULL │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
-- Retrieve metadata from the staged file
SELECT * FROM INSPECT_PARQUET('@my_internal_stage/books.parquet');
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ created_by │ num_columns │ num_rows │ num_row_groups │ serialized_size │ max_row_groups_size_compressed │ max_row_groups_size_uncompressed │
├────────────────────────────────────┼─────────────┼──────────┼────────────────┼─────────────────┼────────────────────────────────┼──────────────────────────────────┤
│ parquet-cpp version 1.5.1-SNAPSHOT │ 3 │ 2 │ 1 │ 998 │ 332 │ 320 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
18.4 - LIST_STAGE
Lists files in a stage. You can filter the files with a regular-expression pattern (for example, by extension) and obtain details about each file. The function is similar to the DDL command LIST STAGE FILES, but because it is used in a SELECT statement, you can retrieve only the file information you need (file name, size, MD5 hash, last modified timestamp, or creator) rather than all of it.
SQL Syntax
LIST_STAGE(
LOCATION => '{ internalStage | externalStage | userStage }'
[, PATTERN => '<regex_pattern>' ]
)
Where:
internalStage
internalStage ::= @<internal_stage_name>[/<path>]
externalStage
externalStage ::= @<external_stage_name>[/<path>]
userStage
userStage ::= @~[/<path>]
PATTERN
See COPY INTO table.
SQL Examples
SELECT * FROM list_stage(location => '@my_stage/', pattern => '.*[.]log');
+----------------+------+------------------------------------+-------------------------------+---------+
| name | size | md5 | last_modified | creator |
+----------------+------+------------------------------------+-------------------------------+---------+
| 2023/meta.log | 475 | "4208ff530b252236e14b3cd797abdfbd" | 2023-04-19 20:23:24.000 +0000 | NULL |
| 2023/query.log | 1348 | "1c6654b207472c277fc8c6207c035e18" | 2023-04-19 20:23:24.000 +0000 | NULL |
+----------------+------+------------------------------------+-------------------------------+---------+
-- Equivalent to the following statement:
LIST @my_stage PATTERN = '.*[.]log';
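To check which file names a pattern such as '.*[.]log' selects, the regular expression can be tried out locally first. The sketch below uses Python's re module; the file list is hypothetical, and it assumes the pattern is applied to the whole file name, which may not match the engine's exact rules.

```python
import re

# Hypothetical stage contents; only the two .log names appear in the example above.
names = ["2023/meta.log", "2023/query.log", "2023/data.parquet", "2023/catalog"]

pattern = re.compile(r".*[.]log")

# Assumption: the pattern must match the entire file name.
matched = [n for n in names if pattern.fullmatch(n)]
print(matched)
```

Note that "2023/catalog" is not selected: [.] matches a literal dot, so a name must end in ".log", not merely in "log".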
18.5 - RESULT_SCAN
Returns the result set of a previous command in the same session as if the result were a table.
SQL Syntax
RESULT_SCAN( { '<query_id>' | LAST_QUERY_ID() } )
SQL Examples
Create a simple table:
CREATE TABLE t1(a INT);
Insert some values:
INSERT INTO t1(a) VALUES (1), (2), (3);
result_scan
SELECT * FROM t1 ORDER BY a;
+-------+
| a |
+-------+
| 1 |
+-------+
| 2 |
+-------+
| 3 |
+-------+
SELECT * FROM RESULT_SCAN(LAST_QUERY_ID()) ORDER BY a;
+-------+
| a |
+-------+
| 1 |
+-------+
| 2 |
+-------+
| 3 |
+-------+
19 - UUID Functions
This section provides reference information for the UUID-related functions in PlaidCloud Lakehouse.
19.1 - GEN_RANDOM_UUID
Generates a random version 4 UUID.
Analyze Syntax
func.gen_random_uuid()
Analyze Examples
func.gen_random_uuid()
┌───────────────────────────────────────┐
│ func.gen_random_uuid() │
├───────────────────────────────────────┤
│ f88e7efe-1bc2-494b-806b-3ffe90db8f47 │
└───────────────────────────────────────┘
SQL Syntax
GEN_RANDOM_UUID()
Aliases
UUID
SQL Examples
SELECT GEN_RANDOM_UUID(), UUID();
┌─────────────────────────────────────────────────────────────────────────────┐
│ gen_random_uuid() │ uuid() │
├──────────────────────────────────────┼──────────────────────────────────────┤
│ f88e7efe-1bc2-494b-806b-3ffe90db8f47 │ f88e7efe-1bc2-494b-806b-3ffe90db8f47 │
└─────────────────────────────────────────────────────────────────────────────┘
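For comparison, a version 4 UUID can also be generated client-side with Python's standard uuid module. This only illustrates the value format (36 characters, version field set to 4); it says nothing about how the Lakehouse function is implemented.

```python
import uuid

first = uuid.uuid4()
second = uuid.uuid4()

print(first, "version:", first.version)
# Each call draws fresh random bits, so separate calls give different values.
print(first != second)
```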
19.2 - UUID
Alias for GEN_RANDOM_UUID.
20 - Window Functions
Overview
A window function operates on a group ("window") of related rows.
For each input row, a window function returns one output row that depends on the specific row passed to the function and the values of the other rows in the window.
There are two main types of order-sensitive window functions:
Rank-related functions
: Rank-related functions list information based on the "rank" of a row. For example, when stores are ranked in descending order by annual profit, the most profitable store is ranked 1, the second-most profitable store is ranked 2, and so on.
Window frame functions
: Window frame functions enable you to perform rolling operations, such as calculating a running total or a moving average, on a subset of the rows in the window.
List of Functions that Support Windows
The list below shows all the window functions.
Window Syntax
<function> ( [ <arguments> ] ) OVER ( { named window | inline window } )
named window ::=
{ window_name | ( window_name ) }
inline window ::=
[ PARTITION BY <expression_list> ]
[ ORDER BY <expression_list> ]
[ window frame ]
The named window is a window that is defined in the WINDOW clause of the SELECT statement, e.g. SELECT a, SUM(a) OVER w FROM t WINDOW w AS ( inline window ).
The <function> is one of: aggregate function, rank function, or value function.
The OVER clause specifies that the function is being used as a window function.
The PARTITION BY sub-clause allows rows to be grouped into sub-groups, for example by city or by year. The PARTITION BY clause is optional; without it, the function analyzes the entire group of rows as a single partition.
The ORDER BY clause orders rows within the window.
The window frame clause specifies the window frame type and the window frame extent. The window frame clause is optional. If you omit it, the default window frame type is RANGE and the default window frame extent is UNBOUNDED PRECEDING AND CURRENT ROW.
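The difference between the RANGE and ROWS frame types only shows up when the ORDER BY key has ties. The sketch below is a plain-Python model of a running sum under both frame types, assuming standard SQL semantics (a RANGE frame ending at CURRENT ROW also includes the current row's peers); the function name and the sample rows are hypothetical.

```python
def running_sum(rows, mode):
    """rows: list of (order_key, value) pairs, already sorted ascending by order_key."""
    out = []
    for i, (key, _) in enumerate(rows):
        if mode == "ROWS":
            # Physical rows from the start of the partition up to this row.
            frame = rows[: i + 1]
        else:
            # RANGE: every row whose key is <= the current key, peers included.
            frame = [r for r in rows if r[0] <= key]
        out.append(sum(value for _, value in frame))
    return out

rows = [(1, 10), (2, 20), (2, 30), (3, 40)]  # two rows share order key 2
rows_result = running_sum(rows, "ROWS")
range_result = running_sum(rows, "RANGE")
print(rows_result)   # [10, 30, 60, 100]
print(range_result)  # [10, 60, 60, 100]
```

With the default RANGE frame, both rows with key 2 see the same total, because each one's frame includes the other.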
Window Frame Syntax
window frame can be one of the following types:
cumulativeFrame ::=
{
{ ROWS | RANGE } BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
| { ROWS | RANGE } BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
}
slidingFrame ::=
{
ROWS BETWEEN <N> { PRECEDING | FOLLOWING } AND <N> { PRECEDING | FOLLOWING }
| ROWS BETWEEN UNBOUNDED PRECEDING AND <N> { PRECEDING | FOLLOWING }
| ROWS BETWEEN <N> { PRECEDING | FOLLOWING } AND UNBOUNDED FOLLOWING
}
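As a rough illustration of a sliding frame, the following plain-Python sketch models SUM(...) OVER (ORDER BY ... ROWS BETWEEN <N> PRECEDING AND <N> FOLLOWING) for a single partition. The helper name sliding_sum is hypothetical; the frame is clipped at the partition boundaries.

```python
def sliding_sum(values, preceding, following):
    """Sum over a frame of `preceding` rows before and `following` rows after each row."""
    out = []
    for i in range(len(values)):
        start = max(0, i - preceding)               # clip at the first row
        end = min(len(values), i + following + 1)   # clip at the last row
        out.append(sum(values[start:end]))
    return out

salaries = [75000, 90000, 72000, 67000]  # one partition, already ordered
window_sums = sliding_sum(salaries, 1, 1)
print(window_sums)  # [165000, 237000, 229000, 139000]
```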
SQL Examples
Create the table
CREATE TABLE employees (
employee_id INT,
first_name VARCHAR,
last_name VARCHAR,
department VARCHAR,
salary INT
);
Insert data
INSERT INTO employees (employee_id, first_name, last_name, department, salary) VALUES
(1, 'John', 'Doe', 'IT', 75000),
(2, 'Jane', 'Smith', 'HR', 85000),
(3, 'Mike', 'Johnson', 'IT', 90000),
(4, 'Sara', 'Williams', 'Sales', 60000),
(5, 'Tom', 'Brown', 'HR', 82000),
(6, 'Ava', 'Davis', 'Sales', 62000),
(7, 'Olivia', 'Taylor', 'IT', 72000),
(8, 'Emily', 'Anderson', 'HR', 77000),
(9, 'Sophia', 'Lee', 'Sales', 58000),
(10, 'Ella', 'Thomas', 'IT', 67000);
Example 1: Ranking employees by salary
In this example, we use the RANK() function to rank employees based on their salaries in descending order. The highest salary will get a rank of 1, and the lowest salary will get the highest rank number.
SELECT employee_id, first_name, last_name, department, salary, RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;
Result:
employee_id | first_name | last_name | department | salary | rank |
---|
3 | Mike | Johnson | IT | 90000 | 1 |
2 | Jane | Smith | HR | 85000 | 2 |
5 | Tom | Brown | HR | 82000 | 3 |
8 | Emily | Anderson | HR | 77000 | 4 |
1 | John | Doe | IT | 75000 | 5 |
7 | Olivia | Taylor | IT | 72000 | 6 |
10 | Ella | Thomas | IT | 67000 | 7 |
6 | Ava | Davis | Sales | 62000 | 8 |
4 | Sara | Williams | Sales | 60000 | 9 |
9 | Sophia | Lee | Sales | 58000 | 10 |
Example 2: Calculating the total salary per department
In this example, we use the SUM() function with PARTITION BY to calculate the total salary paid per department. Each row will show the department and the total salary for that department.
SELECT department, SUM(salary) OVER (PARTITION BY department) AS total_salary
FROM employees;
Result:
department | total_salary |
---|
HR | 244000 |
HR | 244000 |
HR | 244000 |
IT | 304000 |
IT | 304000 |
IT | 304000 |
IT | 304000 |
Sales | 180000 |
Sales | 180000 |
Sales | 180000 |
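Unlike GROUP BY, a window function keeps one output row per input row. The following plain-Python sketch mimics SUM(salary) OVER (PARTITION BY department) on the same data; it models the semantics only and is not how the engine evaluates the query.

```python
from collections import defaultdict

# (department, salary) pairs from the employees table above.
employees = [
    ("IT", 75000), ("HR", 85000), ("IT", 90000), ("Sales", 60000), ("HR", 82000),
    ("Sales", 62000), ("IT", 72000), ("HR", 77000), ("Sales", 58000), ("IT", 67000),
]

# First pass: total per partition.
totals = defaultdict(int)
for dept, salary in employees:
    totals[dept] += salary

# Second pass: every input row is kept and carries its partition's total.
result = sorted((dept, totals[dept]) for dept, _ in employees)
for row in result:
    print(row)
```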
Example 3: Calculating a running total of salaries per department
In this example, we use the SUM() function with a cumulative window frame to calculate a running total of salaries within each department. The running total is calculated based on the employee's salary ordered by their employee_id.
SELECT employee_id, first_name, last_name, department, salary,
SUM(salary) OVER (PARTITION BY department ORDER BY employee_id
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM employees;
Result:
employee_id | first_name | last_name | department | salary | running_total |
---|
2 | Jane | Smith | HR | 85000 | 85000 |
5 | Tom | Brown | HR | 82000 | 167000 |
8 | Emily | Anderson | HR | 77000 | 244000 |
1 | John | Doe | IT | 75000 | 75000 |
3 | Mike | Johnson | IT | 90000 | 165000 |
7 | Olivia | Taylor | IT | 72000 | 237000 |
10 | Ella | Thomas | IT | 67000 | 304000 |
4 | Sara | Williams | Sales | 60000 | 60000 |
6 | Ava | Davis | Sales | 62000 | 122000 |
9 | Sophia | Lee | Sales | 58000 | 180000 |
20.1 - CUME_DIST
Returns the cumulative distribution of a given value in a set of values: the number of rows with values less than or equal to the current row's value, divided by the total number of rows in the partition. The resulting value is greater than 0 and at most 1.
See also: PERCENT_RANK
Analyze Syntax
func.cume_dist().over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.name, table.score, table.grade, func.cume_dist().over(partition_by=[table.grade], order_by=table.score).alias('cume_dist_val')
name |score|grade|cume_dist_val|
--------+-----+-----+-------------+
Smith | 81|A | 0.25|
Davies | 84|A | 0.5|
Evans | 87|A | 0.75|
Johnson | 100|A | 1.0|
Taylor | 62|B | 0.5|
Brown | 62|B | 0.5|
Wilson | 72|B | 1.0|
Thomas | 72|B | 1.0|
Jones | 55|C | 1.0|
Williams| 55|C | 1.0|
SQL Syntax
CUME_DIST() OVER (
PARTITION BY expr, ...
ORDER BY expr [ASC | DESC], ...
)
SQL Examples
This example retrieves the students' names, scores, grades, and the cumulative distribution values (cume_dist_val) within each grade using the CUME_DIST() window function.
CREATE TABLE students (
name VARCHAR(20),
score INT NOT NULL,
grade CHAR(1) NOT NULL
);
INSERT INTO students (name, score, grade)
VALUES
('Smith', 81, 'A'),
('Jones', 55, 'C'),
('Williams', 55, 'C'),
('Taylor', 62, 'B'),
('Brown', 62, 'B'),
('Davies', 84, 'A'),
('Evans', 87, 'A'),
('Wilson', 72, 'B'),
('Thomas', 72, 'B'),
('Johnson', 100, 'A');
SELECT
name,
score,
grade,
CUME_DIST() OVER (PARTITION BY grade ORDER BY score) AS cume_dist_val
FROM
students;
name |score|grade|cume_dist_val|
--------+-----+-----+-------------+
Smith | 81|A | 0.25|
Davies | 84|A | 0.5|
Evans | 87|A | 0.75|
Johnson | 100|A | 1.0|
Taylor | 62|B | 0.5|
Brown | 62|B | 0.5|
Wilson | 72|B | 1.0|
Thomas | 72|B | 1.0|
Jones | 55|C | 1.0|
Williams| 55|C | 1.0|
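The values in the result above can be reproduced by hand from the definition. The sketch below is a plain-Python model of CUME_DIST() partitioned by grade and ordered by score; the helper name cume_dist is hypothetical.

```python
from collections import defaultdict

students = [
    ("Smith", 81, "A"), ("Jones", 55, "C"), ("Williams", 55, "C"),
    ("Taylor", 62, "B"), ("Brown", 62, "B"), ("Davies", 84, "A"),
    ("Evans", 87, "A"), ("Wilson", 72, "B"), ("Thomas", 72, "B"),
    ("Johnson", 100, "A"),
]

# Collect the scores of each partition (grade).
by_grade = defaultdict(list)
for name, score, grade in students:
    by_grade[grade].append(score)

def cume_dist(score, grade):
    """Rows in the partition with score <= this score, divided by partition size."""
    scores = by_grade[grade]
    return sum(1 for s in scores if s <= score) / len(scores)

cume = {name: cume_dist(score, grade) for name, score, grade in students}
print(cume["Smith"], cume["Taylor"], cume["Jones"])  # 0.25 0.5 1.0
```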
20.2 - DENSE_RANK
Returns the rank of a value within a group of values, without gaps in the ranks.
The rank value starts at 1 and continues up sequentially.
If two values are the same, they have the same rank.
Analyze Syntax
func.dense_rank().over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.department, func.sum(table.salary).alias('total_salary'), func.dense_rank().over(order_by=func.sum(table.salary).desc()).alias('dense_rank')
| department | total_salary | dense_rank |
|------------|--------------|------------|
| IT | 172000 | 1 |
| HR | 160000 | 2 |
| Sales | 77000 | 3 |
SQL Syntax
DENSE_RANK() OVER ( [ PARTITION BY <expr1> ] ORDER BY <expr2> [ ASC | DESC ] [ <window_frame> ] )
SQL Examples
Create the table
CREATE TABLE employees (
employee_id INT,
first_name VARCHAR,
last_name VARCHAR,
department VARCHAR,
salary INT
);
Insert data
INSERT INTO employees (employee_id, first_name, last_name, department, salary) VALUES
(1, 'John', 'Doe', 'IT', 90000),
(2, 'Jane', 'Smith', 'HR', 85000),
(3, 'Mike', 'Johnson', 'IT', 82000),
(4, 'Sara', 'Williams', 'Sales', 77000),
(5, 'Tom', 'Brown', 'HR', 75000);
Ranking departments by total salary using DENSE_RANK
SELECT
department,
SUM(salary) AS total_salary,
DENSE_RANK() OVER (ORDER BY SUM(salary) DESC) AS dense_rank
FROM
employees
GROUP BY
department;
Result:
department | total_salary | dense_rank |
---|
IT | 172000 | 1 |
HR | 160000 | 2 |
Sales | 77000 | 3 |
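The data above contains no ties, so it does not show what "without gaps" means. The plain-Python sketch below uses hypothetical totals in which two departments tie, and compares dense ranking with ordinary ranking (standard SQL semantics assumed); the helper name is an illustration only.

```python
def rank_and_dense_rank(values):
    """Return (rank, dense_rank) lists for values ranked in descending order."""
    ordered = sorted(values, reverse=True)
    distinct = sorted(set(values), reverse=True)
    # rank: 1 + number of rows strictly ahead of the value.
    rank = [ordered.index(v) + 1 for v in values]
    # dense_rank: 1 + number of distinct values strictly ahead of the value.
    dense = [distinct.index(v) + 1 for v in values]
    return rank, dense

totals = [172000, 160000, 160000, 77000]  # hypothetical: two departments tie
rank, dense = rank_and_dense_rank(totals)
print(rank)   # [1, 2, 2, 4]
print(dense)  # [1, 2, 2, 3]
```

After the tie, ordinary ranking jumps to 4, while dense ranking continues with 3.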
20.3 - FIRST
Alias for FIRST_VALUE.
20.4 - FIRST_VALUE
Returns the first value from an ordered group of values.
See also: LAST_VALUE, NTH_VALUE
Analyze Syntax
func.first_value(<expr>).over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.employee_id, table.first_name, table.last_name, table.salary, func.first_value(table.first_name).over(order_by=table.salary.desc()).alias('highest_salary_first_name')
employee_id | first_name | last_name | salary | highest_salary_first_name
------------+------------+-----------+---------+--------------------------
4 | Mary | Williams | 7000.00 | Mary
2 | Jane | Smith | 6000.00 | Mary
3 | David | Johnson | 5500.00 | Mary
1 | John | Doe | 5000.00 | Mary
5 | Michael | Brown | 4500.00 | Mary
SQL Syntax
FIRST_VALUE(expression) OVER ([PARTITION BY partition_expression] ORDER BY order_expression [window_frame])
For the syntax of window frame, see Window Frame Syntax.
SQL Examples
CREATE TABLE employees (
employee_id INT,
first_name VARCHAR(50),
last_name VARCHAR(50),
salary DECIMAL(10,2)
);
INSERT INTO employees (employee_id, first_name, last_name, salary)
VALUES
(1, 'John', 'Doe', 5000.00),
(2, 'Jane', 'Smith', 6000.00),
(3, 'David', 'Johnson', 5500.00),
(4, 'Mary', 'Williams', 7000.00),
(5, 'Michael', 'Brown', 4500.00);
-- Use FIRST_VALUE to retrieve the first name of the employee with the highest salary
SELECT employee_id, first_name, last_name, salary,
FIRST_VALUE(first_name) OVER (ORDER BY salary DESC) AS highest_salary_first_name
FROM employees;
employee_id | first_name | last_name | salary | highest_salary_first_name
------------+------------+-----------+---------+--------------------------
4 | Mary | Williams | 7000.00 | Mary
2 | Jane | Smith | 6000.00 | Mary
3 | David | Johnson | 5500.00 | Mary
1 | John | Doe | 5000.00 | Mary
5 | Michael | Brown | 4500.00 | Mary
20.5 - LAG
LAG allows you to access the value of a column from a preceding row within the same result set. It is typically used to retrieve the value of a column in the previous row, based on a specified ordering.
See also: LEAD
Analyze Syntax
func.lag(<expr>, <offset>).over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.product_name, table.sale_amount, func.lag(table.sale_amount, 1).over(partition_by=table.product_name, order_by=table.sale_id).alias('previous_sale_amount')
product_name | sale_amount | previous_sale_amount
-----------------------------------------------
Product A | 1000.00 | NULL
Product A | 1500.00 | 1000.00
Product A | 2000.00 | 1500.00
Product B | 500.00 | NULL
Product B | 800.00 | 500.00
Product B | 1200.00 | 800.00
SQL Syntax
LAG(expression [, offset [, default]]) OVER (PARTITION BY partition_expression ORDER BY sort_expression)
- offset: Specifies the number of rows ahead (LEAD) or behind (LAG) the current row within the partition to retrieve the value from. Defaults to 1.
Note that setting a negative offset has the same effect as using the LEAD function.
- default: Specifies a value to be returned if the LEAD or LAG function encounters a situation where there is no value available due to the offset exceeding the partition's boundaries. Defaults to NULL.
SQL Examples
CREATE TABLE sales (
sale_id INT,
product_name VARCHAR(50),
sale_amount DECIMAL(10, 2)
);
INSERT INTO sales (sale_id, product_name, sale_amount)
VALUES (1, 'Product A', 1000.00),
(2, 'Product A', 1500.00),
(3, 'Product A', 2000.00),
(4, 'Product B', 500.00),
(5, 'Product B', 800.00),
(6, 'Product B', 1200.00);
SELECT product_name, sale_amount, LAG(sale_amount) OVER (PARTITION BY product_name ORDER BY sale_id) AS previous_sale_amount
FROM sales;
product_name | sale_amount | previous_sale_amount
-----------------------------------------------
Product A | 1000.00 | NULL
Product A | 1500.00 | 1000.00
Product A | 2000.00 | 1500.00
Product B | 500.00 | NULL
Product B | 800.00 | 500.00
Product B | 1200.00 | 800.00
-- The following statements return the same result.
SELECT product_name, sale_amount, LAG(sale_amount, -1) OVER (PARTITION BY product_name ORDER BY sale_id) AS next_sale_amount
FROM sales;
SELECT product_name, sale_amount, LEAD(sale_amount) OVER (PARTITION BY product_name ORDER BY sale_id) AS next_sale_amount
FROM sales;
product_name|sale_amount|next_sale_amount|
------------+-----------+----------------+
Product A | 1000.00| 1500.00|
Product A | 1500.00| 2000.00|
Product A | 2000.00| |
Product B | 500.00| 800.00|
Product B | 800.00| 1200.00|
Product B | 1200.00| |
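The roles of offset and default, and the equivalence between a negative LAG offset and LEAD, can be modelled in a few lines. This is a plain-Python sketch over a single ordered partition; the helper names lag and lead are hypothetical stand-ins for the SQL functions.

```python
def lag(values, offset=1, default=None):
    """Value `offset` rows before each row; a negative offset looks ahead instead."""
    out = []
    for i in range(len(values)):
        j = i - offset
        # Outside the partition boundaries, fall back to the default.
        out.append(values[j] if 0 <= j < len(values) else default)
    return out

def lead(values, offset=1, default=None):
    """Value `offset` rows after each row, expressed through lag."""
    return lag(values, -offset, default)

product_a = [1000.00, 1500.00, 2000.00]  # one partition, ordered by sale_id

print(lag(product_a))            # [None, 1000.0, 1500.0]
print(lag(product_a, -1))        # [1500.0, 2000.0, None]
print(lead(product_a))           # [1500.0, 2000.0, None]
print(lag(product_a, 1, 0.0))    # [0.0, 1000.0, 1500.0]
```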
20.6 - LAST
Alias for LAST_VALUE.
20.7 - LAST_VALUE
Returns the last value from an ordered group of values.
See also: FIRST_VALUE, NTH_VALUE
Analyze Syntax
func.last_value(<expr>).over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.employee_id, table.first_name, table.last_name, table.salary, func.last_value(table.first_name).over(order_by=table.salary.desc()).alias('lowest_salary_first_name')
employee_id | first_name | last_name | salary | lowest_salary_first_name
------------+------------+-----------+---------+------------------------
4 | Mary | Williams | 7000.00 | Michael
2 | Jane | Smith | 6000.00 | Michael
3 | David | Johnson | 5500.00 | Michael
1 | John | Doe | 5000.00 | Michael
5 | Michael | Brown | 4500.00 | Michael
SQL Syntax
LAST_VALUE(expression) OVER ([PARTITION BY partition_expression] ORDER BY order_expression [window_frame])
For the syntax of window frame, see Window Frame Syntax.
SQL Examples
CREATE TABLE employees (
employee_id INT,
first_name VARCHAR(50),
last_name VARCHAR(50),
salary DECIMAL(10,2)
);
INSERT INTO employees (employee_id, first_name, last_name, salary)
VALUES
(1, 'John', 'Doe', 5000.00),
(2, 'Jane', 'Smith', 6000.00),
(3, 'David', 'Johnson', 5500.00),
(4, 'Mary', 'Williams', 7000.00),
(5, 'Michael', 'Brown', 4500.00);
-- Use LAST_VALUE to retrieve the first name of the employee with the lowest salary
SELECT employee_id, first_name, last_name, salary,
LAST_VALUE(first_name) OVER (ORDER BY salary DESC) AS lowest_salary_first_name
FROM employees;
employee_id | first_name | last_name | salary | lowest_salary_first_name
------------+------------+-----------+---------+------------------------
4 | Mary | Williams | 7000.00 | Michael
2 | Jane | Smith | 6000.00 | Michael
3 | David | Johnson | 5500.00 | Michael
1 | John | Doe | 5000.00 | Michael
5 | Michael | Brown | 4500.00 | Michael
20.8 - LEAD
LEAD allows you to access the value of a column from a subsequent row within the same result set. It is typically used to retrieve the value of a column in the next row, based on a specified ordering.
See also: LAG
Analyze Syntax
func.lead(<expr>, <offset>).over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.product_name, table.sale_amount, func.lead(table.sale_amount, 1).over(partition_by=table.product_name, order_by=table.sale_id).alias('next_sale_amount')
product_name | sale_amount | next_sale_amount
----------------------------------------------
Product A | 1000.00 | 1500.00
Product A | 1500.00 | 2000.00
Product A | 2000.00 | NULL
Product B | 500.00 | 800.00
Product B | 800.00 | 1200.00
Product B | 1200.00 | NULL
SQL Syntax
LEAD(expression [, offset [, default]]) OVER (PARTITION BY partition_expression ORDER BY sort_expression)
- offset: Specifies the number of rows ahead (LEAD) or behind (LAG) the current row within the partition to retrieve the value from. Defaults to 1.
Note that setting a negative offset has the same effect as using the LAG function.
- default: Specifies a value to be returned if the LEAD or LAG function encounters a situation where there is no value available due to the offset exceeding the partition's boundaries. Defaults to NULL.
SQL Examples
CREATE TABLE sales (
sale_id INT,
product_name VARCHAR(50),
sale_amount DECIMAL(10, 2)
);
INSERT INTO sales (sale_id, product_name, sale_amount)
VALUES (1, 'Product A', 1000.00),
(2, 'Product A', 1500.00),
(3, 'Product A', 2000.00),
(4, 'Product B', 500.00),
(5, 'Product B', 800.00),
(6, 'Product B', 1200.00);
SELECT product_name, sale_amount, LEAD(sale_amount) OVER (PARTITION BY product_name ORDER BY sale_id) AS next_sale_amount
FROM sales;
product_name | sale_amount | next_sale_amount
----------------------------------------------
Product A | 1000.00 | 1500.00
Product A | 1500.00 | 2000.00
Product A | 2000.00 | NULL
Product B | 500.00 | 800.00
Product B | 800.00 | 1200.00
Product B | 1200.00 | NULL
-- The following statements return the same result.
SELECT product_name, sale_amount, LEAD(sale_amount, -1) OVER (PARTITION BY product_name ORDER BY sale_id) AS previous_sale_amount
FROM sales;
SELECT product_name, sale_amount, LAG(sale_amount) OVER (PARTITION BY product_name ORDER BY sale_id) AS previous_sale_amount
FROM sales;
product_name|sale_amount|previous_sale_amount|
------------+-----------+--------------------+
Product A | 1000.00| |
Product A | 1500.00| 1000.00|
Product A | 2000.00| 1500.00|
Product B | 500.00| |
Product B | 800.00| 500.00|
Product B | 1200.00| 800.00|
20.9 - NTH_VALUE
Returns the Nth value from an ordered group of values.
See also: FIRST_VALUE, LAST_VALUE
Analyze Syntax
func.nth_value(<expr>, <n>).over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.employee_id, table.first_name, table.last_name, table.salary, func.nth_value(table.first_name, 2).over(order_by=table.salary.desc()).alias('second_highest_salary_first_name')
employee_id | first_name | last_name | salary | second_highest_salary_first_name
------------+------------+-----------+---------+----------------------------------
4 | Mary | Williams | 7000.00 | Jane
2 | Jane | Smith | 6000.00 | Jane
3 | David | Johnson | 5500.00 | Jane
1 | John | Doe | 5000.00 | Jane
5 | Michael | Brown | 4500.00 | Jane
SQL Syntax
NTH_VALUE(expression, n) OVER ([PARTITION BY partition_expression] ORDER BY order_expression [window_frame])
For the syntax of window frame, see Window Frame Syntax.
SQL Examples
CREATE TABLE employees (
employee_id INT,
first_name VARCHAR(50),
last_name VARCHAR(50),
salary DECIMAL(10,2)
);
INSERT INTO employees (employee_id, first_name, last_name, salary)
VALUES
(1, 'John', 'Doe', 5000.00),
(2, 'Jane', 'Smith', 6000.00),
(3, 'David', 'Johnson', 5500.00),
(4, 'Mary', 'Williams', 7000.00),
(5, 'Michael', 'Brown', 4500.00);
-- Use NTH_VALUE to retrieve the first name of the employee with the second highest salary
SELECT employee_id, first_name, last_name, salary,
NTH_VALUE(first_name, 2) OVER (ORDER BY salary DESC) AS second_highest_salary_first_name
FROM employees;
employee_id | first_name | last_name | salary | second_highest_salary_first_name
------------+------------+-----------+---------+----------------------------------
4 | Mary | Williams | 7000.00 | Jane
2 | Jane | Smith | 6000.00 | Jane
3 | David | Johnson | 5500.00 | Jane
1 | John | Doe | 5000.00 | Jane
5 | Michael | Brown | 4500.00 | Jane
20.10 - NTILE
Divides the sorted result set into a specified number of buckets and assigns a bucket number to each row. The NTILE function is typically used with the ORDER BY clause to sort the results.
The rows are distributed according to their sort order so that the number of rows in each bucket is as equal as possible. If the rows cannot be divided evenly, some buckets have one extra row compared to the others.
Analyze Syntax
func.ntile(<n>).over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.name, table.score, table.grade, func.ntile(3).over(partition_by=[table.grade], order_by=table.score.desc()).alias('bucket')
name |score|grade|bucket|
--------+-----+-----+------+
Johnson | 100|A | 1|
Evans | 87|A | 1|
Davies | 84|A | 2|
Smith | 81|A | 3|
Wilson | 72|B | 1|
Thomas | 72|B | 1|
Taylor | 62|B | 2|
Brown | 62|B | 3|
Jones | 55|C | 1|
Williams| 55|C | 2|
SQL Syntax
NTILE(n) OVER (
PARTITION BY expr, ...
ORDER BY expr [ASC | DESC], ...
)
SQL Examples
This example retrieves the students' names, scores, grades, and assigns them to buckets based on their scores within each grade using the NTILE() window function.
CREATE TABLE students (
name VARCHAR(20),
score INT NOT NULL,
grade CHAR(1) NOT NULL
);
INSERT INTO students (name, score, grade)
VALUES
('Smith', 81, 'A'),
('Jones', 55, 'C'),
('Williams', 55, 'C'),
('Taylor', 62, 'B'),
('Brown', 62, 'B'),
('Davies', 84, 'A'),
('Evans', 87, 'A'),
('Wilson', 72, 'B'),
('Thomas', 72, 'B'),
('Johnson', 100, 'A');
SELECT
name,
score,
grade,
ntile(3) OVER (PARTITION BY grade ORDER BY score DESC) AS bucket
FROM
students;
name |score|grade|bucket|
--------+-----+-----+------+
Johnson | 100|A | 1|
Evans | 87|A | 1|
Davies | 84|A | 2|
Smith | 81|A | 3|
Wilson | 72|B | 1|
Thomas | 72|B | 1|
Taylor | 62|B | 2|
Brown | 62|B | 3|
Jones | 55|C | 1|
Williams| 55|C | 2|
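The bucket sizes in the result above follow from simple integer division. The sketch below is a plain-Python model for one partition; it assumes that the buckets receiving an extra row are the first ones, which is consistent with the output above. The helper name ntile is hypothetical.

```python
def ntile(num_rows, buckets):
    """1-based bucket number for each of num_rows ordered rows."""
    base, extra = divmod(num_rows, buckets)
    out = []
    for bucket in range(1, buckets + 1):
        # The first `extra` buckets get one additional row.
        size = base + (1 if bucket <= extra else 0)
        out.extend([bucket] * size)
    return out

print(ntile(4, 3))  # grades A and B: [1, 1, 2, 3]
print(ntile(2, 3))  # grade C: [1, 2]
```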
20.11 - PERCENT_RANK
Returns the relative rank of a given value within a set of values. The resulting value falls between 0 and 1, inclusive. Please note that the first row in any set has a PERCENT_RANK of 0.
See also: CUME_DIST
Analyze Syntax
func.percent_rank().over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.name, table.score, table.grade, func.percent_rank().over(partition_by=[table.grade], order_by=table.score).alias('percent_rank')
name |score|grade|percent_rank |
--------+-----+-----+------------------+
Smith | 81|A | 0.0|
Davies | 84|A |0.3333333333333333|
Evans | 87|A |0.6666666666666666|
Johnson | 100|A | 1.0|
Taylor | 62|B | 0.0|
Brown | 62|B | 0.0|
Wilson | 72|B |0.6666666666666666|
Thomas | 72|B |0.6666666666666666|
Jones | 55|C | 0.0|
Williams| 55|C | 0.0|
SQL Syntax
PERCENT_RANK() OVER (
PARTITION BY expr, ...
ORDER BY expr [ASC | DESC], ...
)
SQL Examples
This example retrieves the students' names, scores, grades, and the percentile ranks (percent_rank) within each grade using the PERCENT_RANK() window function.
CREATE TABLE students (
name VARCHAR(20),
score INT NOT NULL,
grade CHAR(1) NOT NULL
);
INSERT INTO students (name, score, grade)
VALUES
('Smith', 81, 'A'),
('Jones', 55, 'C'),
('Williams', 55, 'C'),
('Taylor', 62, 'B'),
('Brown', 62, 'B'),
('Davies', 84, 'A'),
('Evans', 87, 'A'),
('Wilson', 72, 'B'),
('Thomas', 72, 'B'),
('Johnson', 100, 'A');
SELECT
name,
score,
grade,
PERCENT_RANK() OVER (PARTITION BY grade ORDER BY score) AS percent_rank
FROM
students;
name |score|grade|percent_rank |
--------+-----+-----+------------------+
Smith | 81|A | 0.0|
Davies | 84|A |0.3333333333333333|
Evans | 87|A |0.6666666666666666|
Johnson | 100|A | 1.0|
Taylor | 62|B | 0.0|
Brown | 62|B | 0.0|
Wilson | 72|B |0.6666666666666666|
Thomas | 72|B |0.6666666666666666|
Jones | 55|C | 0.0|
Williams| 55|C | 0.0|
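The values above correspond to (rank - 1) / (rows in partition - 1). The following plain-Python sketch applies that formula to one partition at a time, ranking in ascending order; the helper name percent_rank is hypothetical.

```python
def percent_rank(scores):
    """(rank - 1) / (rows - 1) for each score in one partition, ascending order."""
    n = len(scores)
    if n == 1:
        return [0.0]  # a single-row partition always has percent rank 0
    ordered = sorted(scores)
    # ordered.index(s) is the number of rows strictly below s, i.e. rank - 1.
    return [ordered.index(s) / (n - 1) for s in scores]

grade_a = percent_rank([81, 84, 87, 100])
grade_b = percent_rank([62, 62, 72, 72])
grade_c = percent_rank([55, 55])
print(grade_a)
print(grade_b)
print(grade_c)
```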
20.12 - RANK
The RANK() function assigns a rank to each value within an ordered group of values.
The rank value starts at 1 and continues up sequentially. If two values are the same, they have the same rank, and the ranks that follow skip ahead by the number of tied rows, leaving gaps (unlike DENSE_RANK).
Analyze Syntax
func.rank().over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.employee_id, table.first_name, table.last_name, table.department, table.salary, func.rank().over(order_by=table.salary.desc()).alias('rank')
| employee_id | first_name | last_name | department | salary | rank |
|-------------|------------|-----------|------------|--------|------|
| 1 | John | Doe | IT | 90000 | 1 |
| 2 | Jane | Smith | HR | 85000 | 2 |
| 3 | Mike | Johnson | IT | 82000 | 3 |
| 4 | Sara | Williams | Sales | 77000 | 4 |
| 5 | Tom | Brown | HR | 75000 | 5 |
SQL Syntax
RANK() OVER (
[ PARTITION BY <expr1> ]
ORDER BY <expr2> [ { ASC | DESC } ]
[ <window_frame> ]
)
SQL Examples
Create the table
CREATE TABLE employees (
employee_id INT,
first_name VARCHAR,
last_name VARCHAR,
department VARCHAR,
salary INT
);
Insert data
INSERT INTO employees (employee_id, first_name, last_name, department, salary) VALUES
(1, 'John', 'Doe', 'IT', 90000),
(2, 'Jane', 'Smith', 'HR', 85000),
(3, 'Mike', 'Johnson', 'IT', 82000),
(4, 'Sara', 'Williams', 'Sales', 77000),
(5, 'Tom', 'Brown', 'HR', 75000);
Ranking employees by salary
SELECT
employee_id,
first_name,
last_name,
department,
salary,
RANK() OVER (ORDER BY salary DESC) AS rank
FROM
employees;
Result:
employee_id | first_name | last_name | department | salary | rank |
---|
1 | John | Doe | IT | 90000 | 1 |
2 | Jane | Smith | HR | 85000 | 2 |
3 | Mike | Johnson | IT | 82000 | 3 |
4 | Sara | Williams | Sales | 77000 | 4 |
5 | Tom | Brown | HR | 75000 | 5 |
20.13 - ROW_NUMBER
Assigns a temporary sequential number to each row within a partition of a result set, starting at 1 for the first row in each partition.
Analyze Syntax
func.row_number().over(partition_by=[<columns>], order_by=[<columns>])
Analyze Examples
table.employee_id, table.first_name, table.last_name, table.department, table.salary, func.row_number().over(partition_by=table.department, order_by=table.salary.desc()).alias('row_num')
┌──────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ employee_id │ first_name │ last_name │ department │ salary │ row_num │
├─────────────────┼──────────────────┼──────────────────┼──────────────────┼─────────────────┼─────────┤
│ 2 │ Jane │ Smith │ HR │ 85000 │ 1 │
│ 5 │ Tom │ Brown │ HR │ 75000 │ 2 │
│ 1 │ John │ Doe │ IT │ 90000 │ 1 │
│ 3 │ Mike │ Johnson │ IT │ 82000 │ 2 │
│ 4 │ Sara │ Williams │ Sales │ 77000 │ 1 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
SQL Syntax
ROW_NUMBER()
OVER ( [ PARTITION BY <expr1> [, <expr2> ... ] ]
ORDER BY <expr3> [ , <expr4> ... ] [ { ASC | DESC } ] )
Parameter | Required? | Description |
---|
ORDER BY | Yes | Specifies the order of rows within each partition. |
ASC / DESC | No | Specifies the sorting order within each partition. ASC (ascending) is the default. |
QUALIFY | No | Filters rows based on conditions. |
SQL Examples
This example demonstrates the use of ROW_NUMBER() to assign sequential numbers to employees within their departments, ordered by descending salary.
-- Prepare the data
CREATE TABLE employees (
employee_id INT,
first_name VARCHAR,
last_name VARCHAR,
department VARCHAR,
salary INT
);
INSERT INTO employees (employee_id, first_name, last_name, department, salary) VALUES
(1, 'John', 'Doe', 'IT', 90000),
(2, 'Jane', 'Smith', 'HR', 85000),
(3, 'Mike', 'Johnson', 'IT', 82000),
(4, 'Sara', 'Williams', 'Sales', 77000),
(5, 'Tom', 'Brown', 'HR', 75000);
-- Select employee details along with the row number partitioned by department and ordered by salary in descending order.
SELECT
employee_id,
first_name,
last_name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM
employees;
┌──────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ employee_id │ first_name │ last_name │ department │ salary │ row_num │
├─────────────────┼──────────────────┼──────────────────┼──────────────────┼─────────────────┼─────────┤
│ 2 │ Jane │ Smith │ HR │ 85000 │ 1 │
│ 5 │ Tom │ Brown │ HR │ 75000 │ 2 │
│ 1 │ John │ Doe │ IT │ 90000 │ 1 │
│ 3 │ Mike │ Johnson │ IT │ 82000 │ 2 │
│ 4 │ Sara │ Williams │ Sales │ 77000 │ 1 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────┘
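The numbering restarts at 1 for every partition. A plain-Python model of the query above, using itertools.groupby on the same data, might look like this; it is a sketch of the semantics only, with the name columns left out for brevity.

```python
from itertools import groupby

# (employee_id, department, salary) from the employees table above.
employees = [
    (1, "IT", 90000), (2, "HR", 85000), (3, "IT", 82000),
    (4, "Sales", 77000), (5, "HR", 75000),
]

# PARTITION BY department, ORDER BY salary DESC within each partition.
ordered = sorted(employees, key=lambda e: (e[1], -e[2]))

numbered = []
for _, group in groupby(ordered, key=lambda e: e[1]):
    # enumerate restarts at 1 for each department.
    for row_num, (emp_id, dept, salary) in enumerate(group, start=1):
        numbered.append((emp_id, dept, salary, row_num))

for row in numbered:
    print(row)
```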