This section provides reference information for the table-related functions in PlaidCloud Lakehouse.
Table Functions
- 1: GENERATE_SERIES
- 2: INFER_SCHEMA
- 3: INSPECT_PARQUET
- 4: LIST_STAGE
- 5: RESULT_SCAN
- 6: SHOW_GRANTS
- 7: STREAM_STATUS
- 8: TASK_HISTORY
1 - GENERATE_SERIES
Generates a dataset starting from a specified point, ending at another specified point, and optionally with an incrementing value. The GENERATE_SERIES function works with the following data types:
- Integer
- Date
- Timestamp
Analyze Syntax
func.generate_series(<start>, <stop>[, <step_interval>])
Analyze Examples
func.generate_series(1, 10, 2);
generate_series|
---------------+
1|
3|
5|
7|
9|
SQL Syntax
GENERATE_SERIES(<start>, <stop>[, <step_interval>])
Arguments
Argument | Description |
---|---|
start | The starting value, representing the first number, date, or timestamp in the sequence. |
stop | The ending value, representing the last number, date, or timestamp in the sequence. |
step_interval | The step interval, determining the difference between adjacent values in the sequence. For integer sequences, the default value is 1. For date sequences, the default step interval is 1 day. For timestamp sequences, the default step interval is 1 microsecond. |
Return Type
Returns a list containing a continuous sequence of numeric values, dates, or timestamps from start to stop.
SQL Examples
SQL Examples 1: Generating Numeric, Date, and Timestamp Data
SELECT * FROM GENERATE_SERIES(1, 10, 2);
generate_series|
---------------+
1|
3|
5|
7|
9|
SELECT * FROM GENERATE_SERIES('2023-03-20'::date, '2023-03-27'::date);
generate_series|
---------------+
2023-03-20|
2023-03-21|
2023-03-22|
2023-03-23|
2023-03-24|
2023-03-25|
2023-03-26|
2023-03-27|
SELECT * FROM GENERATE_SERIES('2023-03-26 00:00'::timestamp, '2023-03-27 12:00'::timestamp, 86400000000);
generate_series |
-------------------+
2023-03-26 00:00:00|
2023-03-27 00:00:00|
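The step for a timestamp sequence is expressed in microseconds; in the query above, 86400000000 microseconds equals one day. As a minimal sketch under the same assumption, a six-hour step looks like this:
-- 21600000000 microseconds = 6 hours
SELECT * FROM GENERATE_SERIES('2023-03-26 00:00'::timestamp, '2023-03-26 12:00'::timestamp, 21600000000);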
SQL Examples 2: Filling Query Result Gaps
This example uses the GENERATE_SERIES function and left join operator to handle gaps in query results caused by missing information in specific ranges.
CREATE TABLE t_metrics (
date Date,
value INT
);
INSERT INTO t_metrics VALUES
('2020-01-01', 200),
('2020-01-01', 300),
('2020-01-04', 300),
('2020-01-04', 300),
('2020-01-05', 400),
('2020-01-10', 700);
SELECT date, SUM(value), COUNT() FROM t_metrics GROUP BY date ORDER BY date;
date |sum(value)|count()|
----------+----------+-------+
2020-01-01| 500| 2|
2020-01-04| 600| 2|
2020-01-05| 400| 1|
2020-01-10| 700| 1|
To close the gaps between January 1st and January 10th, 2020, use the following query:
SELECT t.date, COALESCE(SUM(t_metrics.value), 0), COUNT(t_metrics.value)
FROM generate_series(
'2020-01-01'::Date,
'2020-01-10'::Date
) AS t(date)
LEFT JOIN t_metrics ON t_metrics.date = t.date
GROUP BY t.date ORDER BY t.date;
date |coalesce(sum(t_metrics.value), 0)|count(t_metrics.value)|
----------+---------------------------------+----------------------+
2020-01-01| 500| 2|
2020-01-02| 0| 0|
2020-01-03| 0| 0|
2020-01-04| 600| 2|
2020-01-05| 400| 1|
2020-01-06| 0| 0|
2020-01-07| 0| 0|
2020-01-08| 0| 0|
2020-01-09| 0| 0|
2020-01-10| 700| 1|
2 - INFER_SCHEMA
Automatically detects the file metadata schema and retrieves the column definitions.
infer_schema currently only supports the Parquet file format.
SQL Syntax
INFER_SCHEMA(
LOCATION => '{ internalStage | externalStage }'
[ PATTERN => '<regex_pattern>']
)
Where:
internalStage
internalStage ::= @<internal_stage_name>[/<path>]
externalStage
externalStage ::= @<external_stage_name>[/<path>]
PATTERN = 'regex_pattern'
A PCRE2-based regular expression pattern string, enclosed in single quotes, specifying the file names to match. For PCRE2 syntax, see http://www.pcre.org/current/doc/html/pcre2syntax.html.
SQL Examples
Generate a parquet file in a stage:
CREATE STAGE infer_parquet FILE_FORMAT = (TYPE = PARQUET);
COPY INTO @infer_parquet FROM (SELECT * FROM numbers(10)) FILE_FORMAT = (TYPE = PARQUET);
LIST @infer_parquet;
+-------------------------------------------------------+------+------------------------------------+-------------------------------+---------+
| name | size | md5 | last_modified | creator |
+-------------------------------------------------------+------+------------------------------------+-------------------------------+---------+
| data_e0fd9cba-f45c-4c43-aa07-d6d87d134378_0_0.parquet | 258 | "7DCC9FFE04EA1F6882AED2CF9640D3D4" | 2023-02-09 05:21:52.000 +0000 | NULL |
+-------------------------------------------------------+------+------------------------------------+-------------------------------+---------+
Run infer_schema on the staged file:
SELECT * FROM INFER_SCHEMA(location => '@infer_parquet/data_e0fd9cba-f45c-4c43-aa07-d6d87d134378_0_0.parquet');
+-------------+-----------------+----------+----------+
| column_name | type | nullable | order_id |
+-------------+-----------------+----------+----------+
| number | BIGINT UNSIGNED | 0 | 0 |
+-------------+-----------------+----------+----------+
infer_schema with Pattern Matching
SELECT * FROM infer_schema(location => '@infer_parquet/', pattern => '.*parquet');
+-------------+-----------------+----------+----------+
| column_name | type | nullable | order_id |
+-------------+-----------------+----------+----------+
| number | BIGINT UNSIGNED | 0 | 0 |
+-------------+-----------------+----------+----------+
Create a Table From Parquet File
The infer_schema function can only display the schema of a Parquet file; it cannot create a table from it.
To create a table from a parquet file:
CREATE TABLE mytable AS SELECT * FROM @infer_parquet/ (pattern=>'.*parquet') LIMIT 0;
DESC mytable;
+--------+-----------------+------+---------+-------+
| Field | Type | Null | Default | Extra |
+--------+-----------------+------+---------+-------+
| number | BIGINT UNSIGNED | NO | 0 | |
+--------+-----------------+------+---------+-------+
3 - INSPECT_PARQUET
Retrieves a table of comprehensive metadata from a staged Parquet file, including the following columns:
Column | Description |
---|---|
created_by | The entity or source responsible for creating the Parquet file |
num_columns | The number of columns in the Parquet file |
num_rows | The total number of rows or records in the Parquet file |
num_row_groups | The count of row groups within the Parquet file |
serialized_size | The size of the Parquet file on disk (compressed) |
max_row_groups_size_compressed | The size of the largest row group (compressed) |
max_row_groups_size_uncompressed | The size of the largest row group (uncompressed) |
SQL Syntax
INSPECT_PARQUET('@<path-to-file>')
SQL Examples
This example retrieves the metadata from a staged sample Parquet file named books.parquet. The file contains two records:
Transaction Processing,Jim Gray,1992
Readings in Database Systems,Michael Stonebraker,2004
-- Show the staged file
LIST @my_internal_stage;
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
│ name │ size │ md5 │ last_modified │ creator │
├───────────────┼────────┼──────────────────┼───────────────────────────────┼──────────────────┤
│ books.parquet │ 998 │ NULL │ 2023-04-19 19:34:51.303 +0000 │ NULL │
└──────────────────────────────────────────────────────────────────────────────────────────────┘
-- Retrieve metadata from the staged file
SELECT * FROM INSPECT_PARQUET('@my_internal_stage/books.parquet');
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ created_by │ num_columns │ num_rows │ num_row_groups │ serialized_size │ max_row_groups_size_compressed │ max_row_groups_size_uncompressed │
├────────────────────────────────────┼─────────────┼──────────┼────────────────┼─────────────────┼────────────────────────────────┼──────────────────────────────────┤
│ parquet-cpp version 1.5.1-SNAPSHOT │ 3 │ 2 │ 1 │ 998 │ 332 │ 320 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
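Because the output is a regular result set, you can also select individual metadata columns. A minimal sketch reusing the staged books.parquet file from above:
-- Return only the row and row-group counts
SELECT num_rows, num_row_groups FROM INSPECT_PARQUET('@my_internal_stage/books.parquet');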
4 - LIST_STAGE
Lists files in a stage. This allows you to filter files in a stage based on their extensions and obtain comprehensive details about each file. The function is similar to the DDL command LIST STAGE FILES, but provides you the flexibility to retrieve specific file information with the SELECT statement, such as file name, size, MD5 hash, last modified timestamp, and creator, rather than all file information.
SQL Syntax
LIST_STAGE(
LOCATION => '{ internalStage | externalStage | userStage }'
[ PATTERN => '<regex_pattern>']
)
Where:
internalStage
internalStage ::= @<internal_stage_name>[/<path>]
externalStage
externalStage ::= @<external_stage_name>[/<path>]
userStage
userStage ::= @~[/<path>]
PATTERN
See COPY INTO table.
SQL Examples
SELECT * FROM list_stage(location => '@my_stage/', pattern => '.*[.]log');
+----------------+------+------------------------------------+-------------------------------+---------+
| name | size | md5 | last_modified | creator |
+----------------+------+------------------------------------+-------------------------------+---------+
| 2023/meta.log | 475 | "4208ff530b252236e14b3cd797abdfbd" | 2023-04-19 20:23:24.000 +0000 | NULL |
| 2023/query.log | 1348 | "1c6654b207472c277fc8c6207c035e18" | 2023-04-19 20:23:24.000 +0000 | NULL |
+----------------+------+------------------------------------+-------------------------------+---------+
-- Equivalent to the following statement:
LIST @my_stage PATTERN = '.log';
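Because LIST_STAGE works with SELECT, you can also project and filter the file details. A minimal sketch assuming the same @my_stage stage:
-- Return only the name and size of log files larger than 1 KB, largest first
SELECT name, size
FROM list_stage(location => '@my_stage/', pattern => '.*[.]log')
WHERE size > 1000
ORDER BY size DESC;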
5 - RESULT_SCAN
Returns the result set of a previous command in the same session as if the result were a table.
SQL Syntax
RESULT_SCAN( { '<query_id>' | LAST_QUERY_ID() } )
SQL Examples
Create a simple table:
CREATE TABLE t1(a int);
Insert some values:
INSERT INTO t1(a) VALUES (1), (2), (3);
Query the table, then use result_scan with LAST_QUERY_ID to return the previous query's result set:
SELECT * FROM t1 ORDER BY a;
+-------+
| a |
+-------+
| 1 |
+-------+
| 2 |
+-------+
| 3 |
+-------+
SELECT * FROM RESULT_SCAN(LAST_QUERY_ID()) ORDER BY a;
+-------+
| a |
+-------+
| 1 |
+-------+
| 2 |
+-------+
| 3 |
+-------+
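Because RESULT_SCAN exposes the previous result set as a table, you can also run further queries against it. A minimal sketch continuing the example above:
-- Aggregate over the result set of the most recent query
SELECT SUM(a) AS total FROM RESULT_SCAN(LAST_QUERY_ID());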
6 - SHOW_GRANTS
Lists privileges explicitly granted to a user, to a role, or on a specific object.
SQL Syntax
SHOW_GRANTS('role', '<role_name>')
SHOW_GRANTS('user', '<user_name>')
SHOW_GRANTS('stage', '<stage_name>')
SHOW_GRANTS('udf', '<udf_name>')
SHOW_GRANTS('table', '<table_name>', '<catalog_name>', '<db_name>')
SHOW_GRANTS('database', '<db_name>', '<catalog_name>')
Configuring the enable_expand_roles Setting
The enable_expand_roles setting controls whether the SHOW_GRANTS function expands role inheritance when displaying privileges.
enable_expand_roles=1 (default):
- SHOW_GRANTS recursively expands inherited privileges, meaning that if a role has been granted another role, it will display all the inherited privileges.
- Users will also see all privileges granted through their assigned roles.
enable_expand_roles=0:
- SHOW_GRANTS only displays privileges that are directly assigned to the specified role or user.
- However, the result will still include GRANT ROLE statements to indicate role inheritance.
For example, role a has the SELECT privilege on t1, and role b has the SELECT privilege on t2:
SELECT grants FROM show_grants('role', 'a') ORDER BY object_id;
┌──────────────────────────────────────────────────────┐
│ grants │
├──────────────────────────────────────────────────────┤
│ GRANT SELECT ON 'default'.'default'.'t1' TO ROLE `a` │
└──────────────────────────────────────────────────────┘
SELECT grants FROM show_grants('role', 'b') ORDER BY object_id;
┌──────────────────────────────────────────────────────┐
│ grants │
├──────────────────────────────────────────────────────┤
│ GRANT SELECT ON 'default'.'default'.'t2' TO ROLE `b` │
└──────────────────────────────────────────────────────┘
If you grant role b to role a and check the grants on role a again, you can see that the SELECT privilege on t2 is now included in role a:
GRANT ROLE b TO ROLE a;
SELECT grants FROM show_grants('role', 'a') ORDER BY object_id;
┌──────────────────────────────────────────────────────┐
│ grants │
├──────────────────────────────────────────────────────┤
│ GRANT SELECT ON 'default'.'default'.'t1' TO ROLE `a` │
│ GRANT SELECT ON 'default'.'default'.'t2' TO ROLE `a` │
└──────────────────────────────────────────────────────┘
If you set enable_expand_roles to 0 and check the grants on role a again, the result will show the GRANT ROLE statements instead of listing the specific privileges inherited from role b:
SET enable_expand_roles=0;
SELECT grants FROM show_grants('role', 'a') ORDER BY object_id;
┌──────────────────────────────────────────────────────┐
│ grants │
├──────────────────────────────────────────────────────┤
│ GRANT SELECT ON 'default'.'default'.'t1' TO ROLE `a` │
│ GRANT ROLE b to ROLE `a` │
│ GRANT ROLE public to ROLE `a` │
└──────────────────────────────────────────────────────┘
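Setting the variable back to 1 restores the default expanded view:
SET enable_expand_roles=1;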
SQL Examples
This example illustrates how to list privileges granted to a user, a role, and on a specific object.
-- Create a new user
CREATE USER 'user1' IDENTIFIED BY 'password';
-- Create a new role
CREATE ROLE analyst;
-- Grant the analyst role to the user
GRANT ROLE analyst TO 'user1';
-- Create a stage
CREATE STAGE my_stage;
-- Grant privileges on the stage to the role
GRANT READ ON STAGE my_stage TO ROLE analyst;
-- List privileges granted to the user
SELECT * FROM SHOW_GRANTS('user', 'user1');
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ privileges │ object_name │ object_id │ grant_to │ name │ grants │
├────────────┼─────────────┼──────────────────┼──────────┼────────┼─────────────────────────────────────────────┤
│ Read │ my_stage │ NULL │ USER │ user1 │ GRANT Read ON STAGE my_stage TO 'user1'@'%' │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-- List privileges granted to the role
SELECT * FROM SHOW_GRANTS('role', 'analyst');
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ privileges │ object_name │ object_id │ grant_to │ name │ grants │
├────────────┼─────────────┼──────────────────┼──────────┼─────────┼────────────────────────────────────────────────┤
│ Read │ my_stage │ NULL │ ROLE │ analyst │ GRANT Read ON STAGE my_stage TO ROLE `analyst` │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
-- List privileges granted on the stage
SELECT * FROM SHOW_GRANTS('stage', 'my_stage');
┌─────────────────────────────────────────────────────────────────────────────────────┐
│ privileges │ object_name │ object_id │ grant_to │ name │ grants │
├────────────┼─────────────┼──────────────────┼──────────┼─────────┼──────────────────┤
│ Read │ my_stage │ NULL │ ROLE │ analyst │ │
└─────────────────────────────────────────────────────────────────────────────────────┘
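The table and database forms take the catalog and database names as additional arguments. A minimal sketch continuing the example above (it assumes the table is created in the default catalog and the default database):
-- Create a table and grant SELECT on it to the role
CREATE TABLE t1 (a INT);
GRANT SELECT ON default.t1 TO ROLE analyst;
-- List privileges granted on the table, qualified by catalog and database
SELECT * FROM SHOW_GRANTS('table', 't1', 'default', 'default');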
7 - STREAM_STATUS
Provides information about the status of a specified stream, yielding a single-column result (has_data) that can take on the values true or false:
- true: Indicates that the stream might contain change data capture records.
- false: Indicates that the stream currently does not contain any change data capture records.
:::note
A true value in the has_data column does not guarantee that change data capture records actually exist. Other operations, such as a table compact operation, can also produce a true value even when there are no change data capture records.
:::
:::note
When using STREAM_STATUS in tasks, you must include the database name when referencing the stream (e.g., STREAM_STATUS('mydb.stream_name')).
:::
SQL Syntax
SELECT * FROM STREAM_STATUS('<database_name>.<stream_name>');
-- OR
SELECT * FROM STREAM_STATUS('<stream_name>'); -- Uses current database
SQL Examples
-- Create a table 't' with a column 'c'
CREATE TABLE t (c int);
-- Create a stream 's' on the table 't'
CREATE STREAM s ON TABLE t;
-- Check the initial status of the stream 's'
SELECT * FROM STREAM_STATUS('s');
-- The result should be 'false' indicating no change data capture records initially
┌──────────┐
│ has_data │
├──────────┤
│ false │
└──────────┘
-- Insert a value into the table 't'
INSERT INTO t VALUES (1);
-- Check the updated status of the stream 's' after the insertion
SELECT * FROM STREAM_STATUS('s');
-- The result should now be 'true' indicating the presence of change data capture records
┌──────────┐
│ has_data │
├──────────┤
│ true │
└──────────┘
-- Example with database name specified
SELECT * FROM STREAM_STATUS('mydb.s');
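STREAM_STATUS is typically used as a task condition so that the task runs only when the stream holds change records. A hedged sketch (the task name, warehouse, and target table are hypothetical; see the CREATE TASK documentation for the full syntax):
-- Run the task only when the stream has data; note the database-qualified stream name
CREATE TASK consume_s
  WAREHOUSE = 'default'
  SCHEDULE = 1 MINUTE
  WHEN STREAM_STATUS('mydb.s') = TRUE
AS
  INSERT INTO mydb.t_target SELECT * FROM mydb.s;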
8 - TASK_HISTORY
Displays the task execution history, filtered by the specified arguments.
SQL Syntax
TASK_HISTORY(
[ SCHEDULED_TIME_RANGE_START => <constant_expr> ]
[, SCHEDULED_TIME_RANGE_END => <constant_expr> ]
[, RESULT_LIMIT => <integer> ]
[, TASK_NAME => '<string>' ]
[, ERROR_ONLY => { TRUE | FALSE } ]
[, ROOT_TASK_ID => '<string>'] )
Arguments
All the arguments are optional.
SCHEDULED_TIME_RANGE_START => <constant_expr>, SCHEDULED_TIME_RANGE_END => <constant_expr>
Time range (in TIMESTAMP_LTZ format), within the last 7 days, in which the task execution was scheduled. If the time range does not fall within the last 7 days, an error is returned.
- If SCHEDULED_TIME_RANGE_END is not specified, the function returns those tasks that have already completed, are currently running, or are scheduled in the future.
- If SCHEDULED_TIME_RANGE_END is CURRENT_TIMESTAMP, the function returns those tasks that have already completed or are currently running. Note that a task executed immediately before the current time might still be identified as scheduled.
- To query only those tasks that have already completed or are currently running, include WHERE query_id IS NOT NULL as a filter. The QUERY_ID column in the TASK_HISTORY output is populated only when a task has started running.
If no start or end time is specified, the most recent tasks are returned, up to the specified RESULT_LIMIT value.
RESULT_LIMIT => <integer>
A number specifying the maximum number of rows returned by the function.
If the number of matching rows is greater than this limit, the task executions with the most recent timestamp are returned, up to the specified limit.
Range: 1 to 10000. Default: 100.
TASK_NAME => <string>
A case-insensitive string specifying a task. Only non-qualified task names are supported. Only executions of the specified task are returned. Note that if multiple tasks have the same name, the function returns the history for each of these tasks.
ERROR_ONLY => { TRUE | FALSE }
When set to TRUE, this function returns only task runs that failed or were cancelled.
ROOT_TASK_ID => <string>
Unique identifier for the root task in a task graph. This ID matches the ID column value in the SHOW TASKS output for the same task. Specify the ROOT_TASK_ID to show the history of the root task and any child tasks that are part of the task graph.
Usage Notes
- This function returns a maximum of 10,000 rows, as controlled by the RESULT_LIMIT argument. The default value is 100.
- This function returns results only for the ACCOUNTADMIN role.
SQL Examples
SELECT
*
FROM TASK_HISTORY() ORDER BY scheduled_time;
The above SQL query retrieves all task history records from the TASK_HISTORY function, ordered by the scheduled_time column (subject to the 10,000-row maximum).
SELECT *
FROM TASK_HISTORY(
SCHEDULED_TIME_RANGE_START=>TO_TIMESTAMP('2022-01-02T01:12:00-07:00'),
SCHEDULED_TIME_RANGE_END=>TO_TIMESTAMP('2022-01-02T01:12:30-07:00'));
The above SQL query retrieves all task history records from the TASK_HISTORY function where the scheduled time range starts at '2022-01-02T01:12:00-07:00' and ends at '2022-01-02T01:12:30-07:00'. In other words, it returns the tasks scheduled to run within this specific 30-second window, along with the details of each matching task.
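The remaining arguments can be combined in the same way. A minimal sketch (the task name mytask is hypothetical):
-- Return up to 10 failed or cancelled runs of a specific task
SELECT *
FROM TASK_HISTORY(
    TASK_NAME => 'mytask',
    ERROR_ONLY => TRUE,
    RESULT_LIMIT => 10);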