This section provides reference information for the string-related functions in PlaidCloud Lakehouse.
String Manipulation:
- CONCAT
- CONCAT_WS
- INSERT
- LEFT
- LPAD
- REPEAT
- REPLACE
- REVERSE
- RIGHT
- RPAD
- SPLIT
- SPLIT_PART
- SUBSTR
- SUBSTRING
- TRANSLATE
- TRIM
This is the multi-page printable view of this section. Click here to print.
This section provides reference information for the string-related functions in PlaidCloud Lakehouse.
Returns the numeric value of the leftmost character of the string str.
func.ascii(<expr>)
func.ascii('2')
+-----------------+
| func.ascii('2') |
+-----------------+
| 50 |
+-----------------+
ASCII(<expr>)
Arguments | Description |
---|---|
<expr> | The string. |
TINYINT
SELECT ASCII('2');
+------------+
| ASCII('2') |
+------------+
| 50 |
+------------+
Returns a string representation of the binary value of N.
func.bin(<expr>)
func.bin(12)
+--------------+
| func.bin(12) |
+--------------+
| 1100 |
+--------------+
BIN(<expr>)
Arguments | Description |
---|---|
<expr> | The number. |
VARCHAR
SELECT BIN(12);
+---------+
| BIN(12) |
+---------+
| 1100 |
+---------+
Return the length of a string in bits.
func.bit_length(<expr>)
func.bit_length('Word')
+-------------------------+
| func.bit_length('Word') |
+-------------------------+
| 32 |
+-------------------------+
BIT_LENGTH(<expr>)
Arguments | Description |
---|---|
<expr> | The string. |
BIGINT
SELECT BIT_LENGTH('Word');
+----------------------------+
| SELECT BIT_LENGTH('Word'); |
+----------------------------+
| 32 |
+----------------------------+
Return the character for each integer passed.
func.char(N,...)
func.char(77,121,83,81,76)
+-----------------------------+
| func.char(77,121,83,81,76) |
+-----------------------------+
| 4D7953514C |
+-----------------------------+
CHAR(N, ...)
Arguments | Description |
---|---|
N | Numeric Column |
BINARY
This example shows both the binary value returned as well as the string representation.
SELECT CHAR(77,121,83,81,76) as a, a::String;
┌────────────────────────┐
│ a │ a::string │
│ Binary │ String │
├────────────┼───────────┤
│ 4D7953514C │ MySQL │
└────────────────────────┘
Alias for LENGTH.
Alias for LENGTH.
Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are nonbinary strings, the result is a nonbinary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent nonbinary string form.
func.concat(<expr1>, ...)
func.concat('data', 'bend')
+-----------------------------+
| func.concat('data', 'bend') |
+-----------------------------+
| databend |
+-----------------------------+
CONCAT(<expr1>, ...)
Arguments | Description |
---|---|
<expr1> | string |
A VARCHAR
data type value Or NULL
data type.
SELECT CONCAT('data', 'bend');
+------------------------+
| concat('data', 'bend') |
+------------------------+
| databend |
+------------------------+
SELECT CONCAT('data', NULL, 'bend');
+------------------------------+
| CONCAT('data', NULL, 'bend') |
+------------------------------+
| NULL |
+------------------------------+
SELECT CONCAT('14.3');
+----------------+
| concat('14.3') |
+----------------+
| 14.3 |
+----------------+
CONCAT_WS() stands for Concatenate With Separator and is a special form of CONCAT(). The first argument is the separator for the rest of the arguments. The separator is added between the strings to be concatenated. The separator can be a string, as can the rest of the arguments. If the separator is NULL, the result is NULL.
CONCAT_WS() does not skip empty strings. However, it does skip any NULL values after the separator argument.
func.concat_ws(<separator>, <expr1>, ...)
func.concat_ws(',', 'data', 'fuse', 'labs', '2021')
+-----------------------------------------------------+
| func.concat_ws(',', 'data', 'fuse', 'labs', '2021') |
+-----------------------------------------------------+
| data,fuse,labs,2021 |
+-----------------------------------------------------+
CONCAT_WS(<separator>, <expr1>, ...)
Arguments | Description |
---|---|
<separator> | string column |
<expr1> | value column |
A VARCHAR
data type value Or NULL
data type.
SELECT CONCAT_WS(',', 'data', 'fuse', 'labs', '2021');
+------------------------------------------------+
| CONCAT_WS(',', 'data', 'fuse', 'labs', '2021') |
+------------------------------------------------+
| data,fuse,labs,2021 |
+------------------------------------------------+
SELECT CONCAT_WS(',', 'data', NULL, 'bend');
+--------------------------------------+
| CONCAT_WS(',', 'data', NULL, 'bend') |
+--------------------------------------+
| data,bend |
+--------------------------------------+
SELECT CONCAT_WS(',', 'data', NULL, NULL, 'bend');
+--------------------------------------------+
| CONCAT_WS(',', 'data', NULL, NULL, 'bend') |
+--------------------------------------------+
| data,bend |
+--------------------------------------------+
SELECT CONCAT_WS(NULL, 'data', 'fuse', 'labs');
+-----------------------------------------+
| CONCAT_WS(NULL, 'data', 'fuse', 'labs') |
+-----------------------------------------+
| NULL |
+-----------------------------------------+
SELECT CONCAT_WS(',', NULL);
+----------------------+
| CONCAT_WS(',', NULL) |
+----------------------+
| |
+----------------------+
Takes a string encoded with the base-64 encoded rules nd returns the decoded result as a binary. The result is NULL if the argument is NULL or not a valid base-64 string.
func.from_base64(<expr>)
func.from_base64('YWJj')
+--------------------------+
| func.from_base64('YWJj') |
+--------------------------+
| abc |
+--------------------------+
FROM_BASE64(<expr>)
Arguments | Description |
---|---|
<expr> | The string value. |
BINARY
SELECT TO_BASE64('abc'), FROM_BASE64(TO_BASE64('abc')) as b, b::String;
┌───────────────────────────────────────┐
│ to_base64('abc') │ b │ b::string │
│ String │ Binary │ String │
├──────────────────┼────────┼───────────┤
│ YWJj │ 616263 │ abc │
└───────────────────────────────────────┘
Alias for UNHEX.
Alias for TO_HEX.
Returns the string str, with the substring beginning at position pos and len characters long replaced by the string newstr. Returns the original string if pos is not within the length of the string. Replaces the rest of the string from position pos if len is not within the length of the rest of the string. Returns NULL if any argument is NULL.
func.insert(<str>, <pos>, <len>, <newstr>)
func.insert('Quadratic', 3, 4, 'What')
+----------------------------------------+
| func.insert('Quadratic', 3, 4, 'What') |
+----------------------------------------+
| QuWhattic |
+----------------------------------------+
INSERT(<str>, <pos>, <len>, <newstr>)
Arguments | Description |
---|---|
<str> | The string. |
<pos> | The position. |
<len> | The length. |
<newstr> | The new string. |
VARCHAR
SELECT INSERT('Quadratic', 3, 4, 'What');
+-----------------------------------+
| INSERT('Quadratic', 3, 4, 'What') |
+-----------------------------------+
| QuWhattic |
+-----------------------------------+
SELECT INSERT('Quadratic', -1, 4, 'What');
+---------------------------------------+
| INSERT('Quadratic', (- 1), 4, 'What') |
+---------------------------------------+
| Quadratic |
+---------------------------------------+
SELECT INSERT('Quadratic', 3, 100, 'What');
+-------------------------------------+
| INSERT('Quadratic', 3, 100, 'What') |
+-------------------------------------+
| QuWhat |
+-------------------------------------+
+--------------------------------------------+--------+
| INSERT('123456789', number, number, 'aaa') | number |
+--------------------------------------------+--------+
| 123456789 | 0 |
| aaa23456789 | 1 |
| 1aaa456789 | 2 |
| 12aaa6789 | 3 |
| 123aaa89 | 4 |
| 1234aaa | 5 |
| 12345aaa | 6 |
| 123456aaa | 7 |
| 1234567aaa | 8 |
| 12345678aaa | 9 |
| 123456789 | 10 |
| 123456789 | 11 |
| 123456789 | 12 |
+--------------------------------------------+--------+
Returns the position of the first occurrence of substring substr in string str. This is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.
func.instr(<str>, <substr>)
func.instr('foobarbar', 'bar')
+--------------------------------+
| func.instr('foobarbar', 'bar') |
+--------------------------------+
| 4 |
+--------------------------------+
INSTR(<str>, <substr>)
Arguments | Description |
---|---|
<str> | The string. |
<substr> | The substring. |
BIGINT
SELECT INSTR('foobarbar', 'bar');
+---------------------------+
| INSTR('foobarbar', 'bar') |
+---------------------------+
| 4 |
+---------------------------+
SELECT INSTR('xbar', 'foobar');
+-------------------------+
| INSTR('xbar', 'foobar') |
+-------------------------+
| 0 |
+-------------------------+
Alias for LOWER.
Returns the leftmost len characters from the string str, or NULL if any argument is NULL.
func.left(<str>, <len>)
func.left('foobarbar', 5)
+---------------------------+
| func.left('foobarbar', 5) |
+---------------------------+
| fooba |
+---------------------------+
LEFT(<str>, <len>);
Arguments | Description |
---|---|
<str> | The main string from where the character to be extracted |
<len> | The count of characters |
VARCHAR
SELECT LEFT('foobarbar', 5);
+----------------------+
| LEFT('foobarbar', 5) |
+----------------------+
| fooba |
+----------------------+
Returns the length of a given input string or binary value. In the case of strings, the length represents the count of characters, with each UTF-8 character considered as a single character. For binary data, the length corresponds to the number of bytes.
func.length(<expr>)
func.length('Hello')
+----------------------+
| func.length('Hello') |
+----------------------+
| 5 |
+----------------------+
LENGTH(<expr>)
BIGINT
SELECT LENGTH('Hello'), LENGTH_UTF8('Hello'), CHAR_LENGTH('Hello'), CHARACTER_LENGTH('Hello');
┌───────────────────────────────────────────────────────────────────────────────────────────┐
│ length('hello') │ length_utf8('hello') │ char_length('hello') │ character_length('hello') │
├─────────────────┼──────────────────────┼──────────────────────┼───────────────────────────┤
│ 5 │ 5 │ 5 │ 5 │
└───────────────────────────────────────────────────────────────────────────────────────────┘
Alias for LENGTH.
Pattern matching using an SQL pattern. Returns 1 (TRUE) or 0 (FALSE). If either expr or pat is NULL, the result is NULL.
<column>.like('plaid%')
my_clothes.like('plaid%)
+-----------------+
| my_clothes |
+-----------------+
| plaid pants |
| plaid hat |
| plaid shirt |
+-----------------+
<expr> LIKE <pattern>
SELECT name, category FROM system.functions WHERE name like 'tou%' ORDER BY name;
+----------+------------+
| name | category |
+----------+------------+
| touint16 | conversion |
| touint32 | conversion |
| touint64 | conversion |
| touint8 | conversion |
+----------+------------+
The first syntax returns the position of the first occurrence of substring substr in string str. The second syntax returns the position of the first occurrence of substring substr in string str, starting at position pos. Returns 0 if substr is not in str. Returns NULL if any argument is NULL.
func.locate(<substr>, <str>, <pos>)
func.locate('bar', 'foobarbar')
+------------------------------------+
| func.locate('bar', 'foobarbar') |
+------------------------------------+
| 5 |
+------------------------------------+
func.locate('bar', 'foobarbar', 5)
+------------------------------------+
| func.locate('bar', 'foobarbar', 5) |
+------------------------------------+
| 7 |
+------------------------------------+
LOCATE(<substr>, <str>)
LOCATE(<substr>, <str>, <pos>)
Arguments | Description |
---|---|
<substr> | The substring. |
<str> | The string. |
<pos> | The position. |
BIGINT
SELECT LOCATE('bar', 'foobarbar')
+----------------------------+
| LOCATE('bar', 'foobarbar') |
+----------------------------+
| 4 |
+----------------------------+
SELECT LOCATE('xbar', 'foobar')
+--------------------------+
| LOCATE('xbar', 'foobar') |
+--------------------------+
| 0 |
+--------------------------+
SELECT LOCATE('bar', 'foobarbar', 5)
+-------------------------------+
| LOCATE('bar', 'foobarbar', 5) |
+-------------------------------+
| 7 |
+-------------------------------+
Returns a string with all characters changed to lowercase.
func.lower(<str>)
func.lower('Hello, PlaidCloud!')
+----------------------------------+
| func.lower('Hello, PlaidCloud!') |
+----------------------------------+
| hello, plaidcloud! |
+----------------------------------+
LOWER(<str>)
VARCHAR
SELECT LOWER('Hello, Databend!'), LCASE('Hello, Databend!');
┌───────────────────────────────────────────────────────┐
│ lower('hello, databend!') │ lcase('hello, databend!') │
├───────────────────────────┼───────────────────────────┤
│ hello, databend! │ hello, databend! │
└───────────────────────────────────────────────────────┘
Returns the string str, left-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters.
func.lpad(<str>, <len>, <padstr>)
func.lpad('hi',4,'??')
+------------------------+
| func.lpad('hi',4,'??') |
+------------------------+
| ??hi |
+------------------------+
func.lpad('hi',1,'??')
+------------------------+
| func.lpad('hi',1,'??') |
+------------------------+
| h |
+------------------------+
LPAD(<str>, <len>, <padstr>)
Arguments | Description |
---|---|
<str> | The string. |
<len> | The length. |
<padstr> | The pad string. |
VARCHAR
SELECT LPAD('hi',4,'??');
+---------------------+
| LPAD('hi', 4, '??') |
+---------------------+
| ??hi |
+---------------------+
SELECT LPAD('hi',1,'??');
+---------------------+
| LPAD('hi', 1, '??') |
+---------------------+
| h |
+---------------------+
Alias for SUBSTR.
Pattern not matching using an SQL pattern. Returns 1 (TRUE) or 0 (FALSE). If either expr or pat is NULL, the result is NULL.
<column>.not_like(<pattern>)
my_clothes.not_like('%pants)
+-----------------+
| my_clothes |
+-----------------+
| plaid pants XL |
| plaid hat |
| plaid shirt |
+-----------------+
<expr> NOT LIKE <pattern>
SELECT name, category FROM system.functions WHERE name like 'tou%' AND name not like '%64' ORDER BY name;
+----------+------------+
| name | category |
+----------+------------+
| touint16 | conversion |
| touint32 | conversion |
| touint8 | conversion |
+----------+------------+
Returns 1 if the string expr doesn't match the regular expression specified by the pattern pat, 0 otherwise.
not_(<column>.regexp_match(<pattern>))
With an input table of:
+-----------------+
| my_clothes |
+-----------------+
| plaid pants |
| plaid hat |
| plaid shirt |
| shoes |
+-----------------+
not_(my_clothes.regexp_match('p*'))
+-------------------------------------+
| not_(my_clothes.regexp_match('p*')) |
+-------------------------------------+
| false |
| false |
| false |
| true |
+-------------------------------------+
<expr> NOT REGEXP <pattern>
SELECT 'databend' NOT REGEXP 'd*';
+------------------------------+
| ('databend' not regexp 'd*') |
+------------------------------+
| 0 |
+------------------------------+
Returns 1 if the string expr doesn't match the regular expression specified by the pattern pat, 0 otherwise.
not_(<column>.regexp_match(<pattern>))
With an input table of:
+-----------------+
| my_clothes |
+-----------------+
| plaid pants |
| plaid hat |
| plaid shirt |
| shoes |
+-----------------+
not_(my_clothes.regexp_match('p*'))
+-------------------------------------+
| not_(my_clothes.regexp_match('p*')) |
+-------------------------------------+
| false |
| false |
| false |
| true |
+-------------------------------------+
<expr> NOT RLIKE <pattern>
SELECT 'databend' not rlike 'd*';
+-----------------------------+
| ('databend' not rlike 'd*') |
+-----------------------------+
| 0 |
+-----------------------------+
Returns a string representation of the octal value of N.
func.oct(<expr>)
func.oct(12)
+-----------------+
| func.oct(12) |
+-----------------+
| 014 |
+-----------------+
OCT(<expr>)
SELECT OCT(12);
+---------+
| OCT(12) |
+---------+
| 014 |
+---------+
OCTET_LENGTH() is a synonym for LENGTH().
func.octet_length(<str>)
func.octet_length('databend')
+-------------------------------+
| func.octet_length('databend') |
+-------------------------------+
| 8 |
+-------------------------------+
OCTET_LENGTH(<str>)
SELECT OCTET_LENGTH('databend');
+--------------------------+
| OCTET_LENGTH('databend') |
+--------------------------+
| 8 |
+--------------------------+
If the leftmost character is not a multibyte character, ORD() returns the same value as the ASCII() function.
If the leftmost character of the string str is a multibyte character, returns the code for that character, calculated from the numeric values of its constituent bytes using this formula:
(1st byte code)
+ (2nd byte code * 256)
+ (3rd byte code * 256^2) ...
func.ord(<str>)
func.ord('2')
+----------------+
| func.ord('2) |
+----------------+
| 50 |
+----------------+
ORD(<str>)
Arguments | Description |
---|---|
<str> | The string. |
BIGINT
SELECT ORD('2')
+--------+
| ORD(2) |
+--------+
| 50 |
+--------+
POSITION(substr IN str) is a synonym for LOCATE(substr,str). Returns the position of the first occurrence of substring substr in string str. Returns 0 if substr is not in str. Returns NULL if any argument is NULL.
func.position(<substr>, <str>)
func.position('bar', 'foobarbar')
+-----------------------------------+
| func.position('bar', 'foobarbar') |
+-----------------------------------+
| 4 |
+-----------------------------------+
POSITION(<substr> IN <str>)
Arguments | Description |
---|---|
<substr> | The substring. |
<str> | The string. |
BIGINT
SELECT POSITION('bar' IN 'foobarbar')
+----------------------------+
| POSITION('bar' IN 'foobarbar') |
+----------------------------+
| 4 |
+----------------------------+
SELECT POSITION('xbar' IN 'foobar')
+--------------------------+
| POSITION('xbar' IN 'foobar') |
+--------------------------+
| 0 |
+--------------------------+
Quotes a string to produce a result that can be used as a properly escaped data value in an SQL statement.
func.quote(<str>)
func.quote('Don\'t')
+----------------------+
| func.quote('Don\'t') |
+----------------------+
| Don\'t! |
+----------------------+
QUOTE(<str>)
SELECT QUOTE('Don\'t!');
+-----------------+
| QUOTE('Don't!') |
+-----------------+
| Don\'t! |
+-----------------+
SELECT QUOTE(NULL);
+-------------+
| QUOTE(NULL) |
+-------------+
| NULL |
+-------------+
Returns true
if the string <expr>
matches the regular expression specified by the <pattern>
, false
otherwise.
<column>.regexp_match(<pattern>)
With an input table of:
+-----------------+
| my_clothes |
+-----------------+
| plaid pants |
| plaid hat |
| plaid shirt |
| shoes |
+-----------------+
my_clothes.regexp_match('p*')
+-------------------------------+
| my_clothes.regexp_match('p*') |
+-------------------------------+
| true |
| true |
| true |
| false |
+-------------------------------+
<expr> REGEXP <pattern>
SELECT 'databend' REGEXP 'd*', 'databend' RLIKE 'd*';
┌────────────────────────────────────────────────────┐
│ ('databend' regexp 'd*') │ ('databend' rlike 'd*') │
├──────────────────────────┼─────────────────────────┤
│ true │ true │
└────────────────────────────────────────────────────┘
Returns the starting index of the substring of the string expr
that matches the regular expression specified by the pattern pat
, 0
if there is no match. If expr
or pat
is NULL, the return value is NULL. Character indexes begin at 1
.
func.regexp_instr(<expr>, <pat[, pos[, occurrence[, return_option[, match_type]]]]>)
func.regexp_instr('dog cat dog', 'dog')
+-----------------------------------------+
| func.regexp_instr('dog cat dog', 'dog') |
+-----------------------------------------+
| 1 |
+-----------------------------------------+
REGEXP_INSTR(<expr>, <pat[, pos[, occurrence[, return_option[, match_type]]]]>)
Arguments | Description |
---|---|
expr | The string expr that to be matched |
pat | The regular expression |
pos | Optional. The position in expr at which to start the search. If omitted, the default is 1. |
occurrence | Optional. Which occurrence of a match to search for. If omitted, the default is 1. |
return_option | Optional. Which type of position to return. If this value is 0, REGEXP_INSTR() returns the position of the matched substring's first character. If this value is 1, REGEXP_INSTR() returns the position following the matched substring. If omitted, the default is 0. |
match_type | Optional. A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE(). |
A number data type value.
SELECT REGEXP_INSTR('dog cat dog', 'dog');
+------------------------------------+
| REGEXP_INSTR('dog cat dog', 'dog') |
+------------------------------------+
| 1 |
+------------------------------------+
SELECT REGEXP_INSTR('dog cat dog', 'dog', 2);
+---------------------------------------+
| REGEXP_INSTR('dog cat dog', 'dog', 2) |
+---------------------------------------+
| 9 |
+---------------------------------------+
SELECT REGEXP_INSTR('aa aaa aaaa', 'a{2}');
+-------------------------------------+
| REGEXP_INSTR('aa aaa aaaa', 'a{2}') |
+-------------------------------------+
| 1 |
+-------------------------------------+
SELECT REGEXP_INSTR('aa aaa aaaa', 'a{4}');
+-------------------------------------+
| REGEXP_INSTR('aa aaa aaaa', 'a{4}') |
+-------------------------------------+
| 8 |
+-------------------------------------+
REGEXP_LIKE function is used to check that whether the string matches the regular expression.
func.regexp_like(<expr>, <pat[, match_type]>)
func.regexp_like('a', '^[a-d]')
+---------------------------------+
| func.regexp_like('a', '^[a-d]') |
+---------------------------------+
| 1 |
+---------------------------------+
REGEXP_LIKE(<expr>, <pat[, match_type]>)
Arguments | Description |
---|---|
<expr> | The string expr that to be matched |
<pat> | The regular expression |
[match_type] | Optional. match_type argument is a string that specifying how to perform matching |
match_type
may contain any or all the following characters:
c
: Case-sensitive matching.i
: Case-insensitive matching.m
: Multiple-line mode. Recognize line terminators within the string. The default behavior is to match line terminators only at the start and end of the string expression.n
: The .
character matches line terminators. The default is for .
matching to stop at the end of a line.u
: Unix-only line endings. Not be supported now.BIGINT
Returns 1
if the string expr matches the regular expression specified by the pattern pat, 0
otherwise. If expr or pat is NULL, the return value is NULL.
SELECT REGEXP_LIKE('a', '^[a-d]');
+----------------------------+
| REGEXP_LIKE('a', '^[a-d]') |
+----------------------------+
| 1 |
+----------------------------+
SELECT REGEXP_LIKE('abc', 'ABC');
+---------------------------+
| REGEXP_LIKE('abc', 'ABC') |
+---------------------------+
| 1 |
+---------------------------+
SELECT REGEXP_LIKE('abc', 'ABC', 'c');
+--------------------------------+
| REGEXP_LIKE('abc', 'ABC', 'c') |
+--------------------------------+
| 0 |
+--------------------------------+
SELECT REGEXP_LIKE('new*\n*line', 'new\\*.\\*line');
+-------------------------------------------+
| REGEXP_LIKE('new*
*line', 'new\*.\*line') |
+-------------------------------------------+
| 0 |
+-------------------------------------------+
SELECT REGEXP_LIKE('new*\n*line', 'new\\*.\\*line', 'n');
+------------------------------------------------+
| REGEXP_LIKE('new*
*line', 'new\*.\*line', 'n') |
+------------------------------------------------+
| 1 |
+------------------------------------------------+
Replaces occurrences in the string expr
that match the regular expression specified by the pattern pat
with the replacement string repl
, and returns the resulting string. If expr
, pat
, or repl
is NULL, the return value is NULL.
func.regexp_replace(<expr>, <pat>, <repl[, pos[, occurrence[, match_type]]]>)
func.regexp_replace('a b c', 'b', 'X')
+----------------------------------------+
| func.regexp_replace('a b c', 'b', 'X') |
+----------------------------------------+
| a X c |
+----------------------------------------+
REGEXP_REPLACE(<expr>, <pat>, <repl[, pos[, occurrence[, match_type]]]>)
Arguments | Description |
---|---|
expr | The string expr that to be matched |
pat | The regular expression |
repl | The replacement string |
pos | Optional. The position in expr at which to start the search. If omitted, the default is 1. |
occurrence | Optional. Which occurrence of a match to replace. If omitted, the default is 0 (which means "replace all occurrences"). |
match_type | Optional. A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE(). |
VARCHAR
SELECT REGEXP_REPLACE('a b c', 'b', 'X');
+-----------------------------------+
| REGEXP_REPLACE('a b c', 'b', 'X') |
+-----------------------------------+
| a X c |
+-----------------------------------+
SELECT REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3);
+----------------------------------------------------+
| REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3) |
+----------------------------------------------------+
| abc def X |
+----------------------------------------------------+
SELECT REGEXP_REPLACE('周 周周 周周周', '周+', 'X', 3, 2);
+-----------------------------------------------------------+
| REGEXP_REPLACE('周 周周 周周周', '周+', 'X', 3, 2) |
+-----------------------------------------------------------+
| 周 周周 X |
+-----------------------------------------------------------+
Returns the substring of the string expr
that matches the regular expression specified by the pattern pat
, NULL if there is no match. If expr or pat is NULL, the return value is NULL.
func.regexp_substr(<expr>, <pat[, pos[, occurrence[, match_type]]]>)
func.regexp_substr('abc def ghi', '[a-z]+')
+---------------------------------------------+
| func.regexp_substr('abc def ghi', '[a-z]+') |
+---------------------------------------------+
| abc |
+---------------------------------------------+
REGEXP_SUBSTR(<expr>, <pat[, pos[, occurrence[, match_type]]]>)
Arguments | Description |
---|---|
expr | The string expr that to be matched |
pat | The regular expression |
pos | Optional. The position in expr at which to start the search. If omitted, the default is 1. |
occurrence | Optional. Which occurrence of a match to search for. If omitted, the default is 1. |
match_type | Optional. A string that specifies how to perform matching. The meaning is as described for REGEXP_LIKE(). |
VARCHAR
SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+');
+----------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+') |
+----------------------------------------+
| abc |
+----------------------------------------+
SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3);
+----------------------------------------------+
| REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3) |
+----------------------------------------------+
| ghi |
+----------------------------------------------+
SELECT REGEXP_SUBSTR('周 周周 周周周 周周周周', '周+', 2, 3);
+------------------------------------------------------------------+
| REGEXP_SUBSTR('周 周周 周周周 周周周周', '周+', 2, 3) |
+------------------------------------------------------------------+
| 周周周周 |
+------------------------------------------------------------------+
Returns a string consisting of the string str repeated count times. If count is less than 1, returns an empty string. Returns NULL if str or count are NULL.
func.repeat(<str>, <count>)
func.repeat(<str>, <count>)
+-------------------------+
| func.repeat('plaid', 3) |
+-------------------------+
| plaidplaidplaid |
+-------------------------+
REPEAT(<str>, <count>)
Arguments | Description |
---|---|
<str> | The string. |
<count> | The number. |
SELECT REPEAT('databend', 3);
+--------------------------+
| REPEAT('databend', 3) |
+--------------------------+
| databenddatabenddatabend |
+--------------------------+
SELECT REPEAT('databend', 0);
+-----------------------+
| REPEAT('databend', 0) |
+-----------------------+
| |
+-----------------------+
SELECT REPEAT('databend', NULL);
+--------------------------+
| REPEAT('databend', NULL) |
+--------------------------+
| NULL |
+--------------------------+
Returns the string str with all occurrences of the string from_str replaced by the string to_str.
func.replace(<str>, <from_str>, <to_str>)
func.replace(<str>, <from_str>, <to_str>)
+--------------------------------------+
| func.replace('plaidCloud', 'p', 'P') |
+--------------------------------------+
| PlaidCloud |
+--------------------------------------+
REPLACE(<str>, <from_str>, <to_str>)
Arguments | Description |
---|---|
<str> | The string. |
<from_str> | The from string. |
<to_str> | The to string. |
VARCHAR
SELECT REPLACE('www.mysql.com', 'w', 'Ww');
+-------------------------------------+
| REPLACE('www.mysql.com', 'w', 'Ww') |
+-------------------------------------+
| WwWwWw.mysql.com |
+-------------------------------------+
Returns the string str with the order of the characters reversed.
func.reverse(<str>)
func.reverse('abc')
+----------------------+
| func..reverse('abc') |
+----------------------+
| cba |
+----------------------+
REVERSE(<str>)
Arguments | Description |
---|---|
<str> | The string value. |
VARCHAR
SELECT REVERSE('abc');
+----------------+
| REVERSE('abc') |
+----------------+
| cba |
+----------------+
Returns the rightmost len characters from the string str, or NULL if any argument is NULL.
func.right(<str>, <len>)
func.right('foobarbar', 4)
+----------------------------+
| func.right('foobarbar', 4) |
+----------------------------+
| rbar |
+----------------------------+
RIGHT(<str>, <len>);
Arguments | Description |
---|---|
<str> | The main string from where the character to be extracted |
<len> | The count of characters |
VARCHAR
SELECT RIGHT('foobarbar', 4);
+-----------------------+
| RIGHT('foobarbar', 4) |
+-----------------------+
| rbar |
+-----------------------+
Alias for REGEXP.
Returns the string str, right-padded with the string padstr to a length of len characters. If str is longer than len, the return value is shortened to len characters.
func.rpad(<str>, <len>, <padstr>)
func.rpad('hi',5,'?')
+-----------------------+
| func.rpad('hi',5,'?') |
+-----------------------+
| hi??? |
+-----------------------+
func.rpad('hi',1,'?')
+-----------------------+
| func.rpad('hi',1,'?') |
+-----------------------+
| h |
+-----------------------+
RPAD(<str>, <len>, <padstr>)
Arguments | Description |
---|---|
<str> | The string. |
<len> | The length. |
<padstr> | The pad string. |
VARCHAR
SELECT RPAD('hi',5,'?');
+--------------------+
| RPAD('hi', 5, '?') |
+--------------------+
| hi??? |
+--------------------+
SELECT RPAD('hi',1,'?');
+--------------------+
| RPAD('hi', 1, '?') |
+--------------------+
| h |
+--------------------+
Generates the Soundex code for a string.
See also: SOUNDS LIKE
func.soundex(<str>)
func.soundex('PlaidCloud Lakehouse')
+--------------------------------------+
| func.soundex('PlaidCloud Lakehouse') |
+--------------------------------------+
| D153 |
+--------------------------------------+
SOUNDEX(<str>)
Arguments | Description |
---|---|
str | The string. |
Returns a code of type VARCHAR or a NULL value.
SELECT SOUNDEX('PlaidCloud Lakehouse');
---
D153
-- All non-alphabetic characters in the string are ignored.
SELECT SOUNDEX('PlaidCloud Lakehouse!');
---
D153
-- All international alphabetic characters outside the A-Z range are ignored unless they're the first letter.
SELECT SOUNDEX('PlaidCloud Lakehouse,你好');
---
D153
SELECT SOUNDEX('你好,PlaidCloud Lakehouse');
---
你3153
-- SUBSTR the result to get a standard Soundex code.
SELECT SOUNDEX('databend cloud'),SUBSTR(SOUNDEX('databend cloud'),1,4);
soundex('databend cloud')|substring(soundex('databend cloud') from 1 for 4)|
-------------------------+-------------------------------------------------+
D153243 |D153 |
SELECT SOUNDEX(NULL);
+-------------------------------------+
| `SOUNDEX(NULL)` |
+-------------------------------------+
| <null> |
+-------------------------------------+
Compares the pronunciation of two strings by their Soundex codes. Soundex is a phonetic algorithm that produces a code representing the pronunciation of a string, allowing for approximate matching of strings based on their pronunciation rather than their spelling. PlaidCloud Lakehouse offers the SOUNDEX function that allows you to get the Soundex code from a string.
SOUNDS LIKE is frequently employed in the WHERE clause of SQL queries to narrow down rows using fuzzy string matching, such as for names and addresses, see Filtering Rows in Examples.
func.sounds_like(<str1>, <str2>)
func..sounds_like('Monday', 'Sunday')
+---------------------------------------+
| func..sounds_like('Monday', 'Sunday') |
+---------------------------------------+
| 0 |
+---------------------------------------+
<str1> SOUNDS LIKE <str2>
Arguments | Description |
---|---|
str1, 2 | The strings you compare. |
Return a Boolean value of 1 if the Soundex codes for the two strings are the same (which means they sound alike) and 0 otherwise.
SELECT 'two' SOUNDS LIKE 'too'
----
1
SELECT CONCAT('A', 'B') SOUNDS LIKE 'AB';
----
1
SELECT 'Monday' SOUNDS LIKE 'Sunday';
----
0
SELECT * FROM employees;
id|first_name|last_name|age|
--+----------+---------+---+
0|John |Smith | 35|
0|Mark |Smythe | 28|
0|Johann |Schmidt | 51|
0|Eric |Doe | 30|
0|Sue |Johnson | 45|
SELECT * FROM employees
WHERE first_name SOUNDS LIKE 'John';
id|first_name|last_name|age|
--+----------+---------+---+
0|John |Smith | 35|
0|Johann |Schmidt | 51|
Returns a string consisting of N blank space characters.
func.space(<n>)
func.space(20)
+-----------------+
| func.space(20) |
+-----------------+
| |
+-----------------+
SPACE(<n>);
Arguments | Description |
---|---|
<n> | The number of spaces |
String data type value.
SELECT SPACE(20)
+----------------------+
| SPACE(20) |
+----------------------+
| |
+----------------------+
import FunctionDescription from '@site/src/components/FunctionDescription';
Splits a string using a specified delimiter and returns the resulting parts as an array.
See also: SPLIT_PART
func.split('<input_string>', '<delimiter>')
func.split('PlaidCloud Lakehouse', ' ')
+-----------------------------------------+
| func.split('PlaidCloud Lakehouse', ' ') |
+-----------------------------------------+
| ['PlaidCloud Lakehouse'] |
+-----------------------------------------+
SPLIT('<input_string>', '<delimiter>')
Array of strings. SPLIT returns NULL when either the input string or the delimiter is NULL.
-- Use a space as the delimiter
-- SPLIT returns an array with two parts.
SELECT SPLIT('PlaidCloud Lakehouse', ' ');
split('PlaidCloud Lakehouse', ' ')|
----------------------------------+
['PlaidCloud','Lakehouse'] |
-- Use an empty string as the delimiter or a delimiter that does not exist in the input string
-- SPLIT returns an array containing the entire input string as a single part.
SELECT SPLIT('PlaidCloud Lakehouse', '');
split('databend cloud', '')|
----------------------------------+
['PlaidCloud Lakehouse'] |
SELECT SPLIT('PlaidCloud Lakehouse', ',');
split('databend cloud', ',')|
----------------------------------+
['PlaidCloud Lakehouse'] |
-- Use ' ' (tab) as the delimiter
-- SPLIT returns an array with timestamp, log level, and message.
SELECT SPLIT('2023-10-19 15:30:45 INFO Log message goes here', ' ');
split('2023-10-19 15:30:45\tinfo\tlog message goes here', '\t')|
---------------------------------------------------------------+
['2023-10-19 15:30:45','INFO','Log message goes here'] |
import FunctionDescription from '@site/src/components/FunctionDescription';
Splits a string using a specified delimiter and returns the specified part.
See also: SPLIT
func.split_part('<input_string>', '<delimiter>', '<position>')
func.split_part('PlaidCloud Lakehouse', ' ', 1)
+-------------------------------------------------+
| func.split_part('PlaidCloud Lakehouse', ' ', 1) |
+-------------------------------------------------+
| PlaidCloud |
+-------------------------------------------------+
SPLIT_PART('<input_string>', '<delimiter>', '<position>')
The position argument specifies which part to return. It uses a 1-based index but can also accept positive, negative, or zero values:
String. SPLIT_PART returns NULL when either the input string, the delimiter, or the position is NULL.
-- Use a space as the delimiter
-- SPLIT_PART returns a specific part.
SELECT SPLIT_PART('PlaidCloud Lakehouse', ' ', 1);
split_part('PlaidCloud Lakehouse', ' ', 1)|
------------------------------------------+
PlaidCloud Lakehouse |
-- Use an empty string as the delimiter or a delimiter that does not exist in the input string
-- SPLIT_PART returns the entire input string.
SELECT SPLIT_PART('PlaidCloud Lakehouse', '', 1);
split_part('PlaidCloud Lakehouse', '', 1)|
-----------------------------------+
PlaidCloud Lakehouse |
SELECT SPLIT_PART('PlaidCloud Lakehouse', ',', 1);
split_part('PlaidCloud Lakehouse', ',', 1)|
------------------------------------+
PlaidCloud Lakehouse |
-- Use ' ' (tab) as the delimiter
-- SPLIT_PART returns individual fields.
SELECT SPLIT_PART('2023-10-19 15:30:45 INFO Log message goes here', ' ', 3);
split_part('2023-10-19 15:30:45 info log message goes here', ' ', 3)|
--------------------------------------------------------------------------+
Log message goes here |
-- SPLIT_PART returns an empty string as the specified part does not exist at all.
SELECT SPLIT_PART('2023-10-19 15:30:45 INFO Log message goes here', ' ', 4);
split_part('2023-10-19 15:30:45 info log message goes here', ' ', 4)|
--------------------------------------------------------------------------+
|
Returns 0 if the strings are the same, -1 if the first argument is smaller than the second, and 1 otherwise.
func.strcmp(<expr1> ,<expr2>)
func.strcmp('text', 'text2')
+------------------------------+
| func.strcmp('text', 'text2') |
+------------------------------+
| -1 |
+------------------------------+
func.strcmp('text2', 'text')
+------------------------------+
| func.strcmp('text2', 'text') |
+------------------------------+
| 1 |
+------------------------------+
func.strcmp('text', 'text')
+------------------------------+
| func.strcmp('text', 'text') |
+------------------------------+
| 0 |
+------------------------------+
STRCMP(<expr1> ,<expr2>)
Arguments | Description |
---|---|
<expr1> | The string. |
<expr2> | The string. |
BIGINT
SELECT STRCMP('text', 'text2');
+-------------------------+
| STRCMP('text', 'text2') |
+-------------------------+
| -1 |
+-------------------------+
SELECT STRCMP('text2', 'text');
+-------------------------+
| STRCMP('text2', 'text') |
+-------------------------+
| 1 |
+-------------------------+
SELECT STRCMP('text', 'text');
+------------------------+
| STRCMP('text', 'text') |
+------------------------+
| 0 |
+------------------------+
Extracts a string containing a specific number of characters from a particular position of a given string.
len
argument return a substring from string str
starting at position pos
.len
argument return a substring len
characters long from string str
, starting at position pos
.It is also possible to use a negative value for pos
. In this case, the beginning of the substring is pos characters from the end of the string, rather than the beginning. A negative value may be used for pos
in any of the forms of this function. A value of 0 for pos
returns an empty string. The position of the first character in the string from which the substring is to be extracted is reckoned as 1.
func.substr(<str>, <pos>, <len>)
func.substr('Quadratically', 5, 6)
+------------------------------------+
| func.substr('Quadratically', 5, 6) |
+------------------------------------+
| ratica |
+------------------------------------+
SUBSTR(<str>, <pos>)
SUBSTR(<str>, <pos>, <len>)
Arguments | Description |
---|---|
<str> | The main string from where the character to be extracted |
<pos> | The position (starting from 1) the substring to start at. If negative, counts from the end |
<len> | The maximum length of the substring to extract |
VARCHAR
SELECT
SUBSTRING('Quadratically', 5),
SUBSTR('Quadratically', 5),
MID('Quadratically', 5);
┌─────────────────────────────────────────────────────────────────────────────────────────────────┐
│ substring('quadratically' from 5) │ substring('quadratically' from 5) │ mid('quadratically', 5) │
├───────────────────────────────────┼───────────────────────────────────┼─────────────────────────┤
│ ratically │ ratically │ ratically │
└─────────────────────────────────────────────────────────────────────────────────────────────────┘
SELECT
SUBSTRING('Quadratically', 5, 6),
SUBSTR('Quadratically', 5, 6),
MID('Quadratically', 5, 6);
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ substring('quadratically' from 5 for 6) │ substring('quadratically' from 5 for 6) │ mid('quadratically', 5, 6) │
├─────────────────────────────────────────┼─────────────────────────────────────────┼────────────────────────────┤
│ ratica │ ratica │ ratica │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Alias for SUBSTR.
Converts the string argument to base-64 encoded form and returns the result as a character string. If the argument is not a string, it is converted to a string before conversion takes place. The result is NULL if the argument is NULL.
func.to_base64(<v>)
func.to_base64('abc')
+-----------------------+
| func.to_base64('abc') |
+-----------------------+
| YWJj |
+-----------------------+
TO_BASE64(<v>)
Arguments | Description |
---|---|
<v> | The value. |
VARCHAR
SELECT TO_BASE64('abc');
+------------------+
| TO_BASE64('abc') |
+------------------+
| YWJj |
+------------------+
import FunctionDescription from '@site/src/components/FunctionDescription';
Transforms a given string by replacing specific characters with corresponding replacements, as defined by the provided mapping.
func.translate('<inputString>', '<charactersToReplace>', '<replacementCharacters>')
func.translate('databend', 'de', 'DE')
+----------------------------------------+
| func.translate('databend', 'de', 'DE') |
+----------------------------------------+
| DatabEnD |
+----------------------------------------+
TRANSLATE('<inputString>', '<charactersToReplace>', '<replacementCharacters>')
Parameter | Description |
---|---|
<inputString> | The input string to be transformed. |
<charactersToReplace> | The string containing characters to be replaced in the input string. |
<replacementCharacters> | The string containing replacement characters corresponding to those in <charactersToReplace> . |
-- Replace 'd' with '$' in 'databend'
SELECT TRANSLATE('databend', 'd', '$');
---
$ataben$
-- Replace 'd' with 'D' in 'databend'
SELECT TRANSLATE('databend', 'd', 'D');
---
DatabenD
-- Replace 'd' with 'D' and 'e' with 'E' in 'databend'
SELECT TRANSLATE('databend', 'de', 'DE');
---
DatabEnD
-- Remove 'd' from 'databend'
SELECT TRANSLATE('databend', 'd', '');
---
ataben
Returns the string without leading or trailing occurrences of the specified remove string. If remove string is omitted, spaces are removed.
The Analyze function automatically trims both leading and trailing spaces.
func.trim(str)
func.trim(' plaidcloud ')
+--------------------------------+
| func.trim(' plaidcloud ') |
+--------------------------------+
| 'plaidcloud' |
+--------------------------------+
TRIM([{BOTH | LEADING | TRAILING} [remstr] FROM ] str)
Please note that ALL the examples in this section will return the string 'databend'.
The following example removes the leading and trailing string 'xxx' from the string 'xxxdatabendxxx':
SELECT TRIM(BOTH 'xxx' FROM 'xxxdatabendxxx');
The following example removes the leading string 'xxx' from the string 'xxxdatabend':
SELECT TRIM(LEADING 'xxx' FROM 'xxxdatabend' );
The following example removes the trailing string 'xxx' from the string 'databendxxx':
SELECT TRIM(TRAILING 'xxx' FROM 'databendxxx' );
If no remove string is specified, the function removes all leading and trailing spaces. The following examples remove the leading and/or trailing spaces:
SELECT TRIM(' databend ');
SELECT TRIM(' databend');
SELECT TRIM('databend ');
Alias for UPPER.
For a string argument str, UNHEX(str) interprets each pair of characters in the argument as a hexadecimal number and converts it to the byte represented by the number. The return value is a binary string.
func.unhex(<expr>)
func.unhex('6461746162656e64')
+--------------------------------+
| func.unhex('6461746162656e64') |
+--------------------------------+
| 6461746162656E64 |
+--------------------------------+
UNHEX(<expr>)
SELECT UNHEX('6461746162656e64') as c1, typeof(c1),UNHEX('6461746162656e64')::varchar as c2, typeof(c2), FROM_HEX('6461746162656e64');
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ c1 │ typeof(c1) │ c2 | typeof(c2) | from_hex('6461746162656e64') |
├───────────────────────────┼────────────────────────|──────────────────┤───────────────────|─────────────────────────────────┤
│ 6461746162656E64 │ binary │ databend | varchar | 6461746162656E64 |
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
SELECT UNHEX(HEX('string')), unhex(HEX('string'))::varchar;
┌──────────────────────────────────────────────────────┐
│ unhex(hex('string')) │ unhex(hex('string'))::varchar │
├──────────────────────┼───────────────────────────────┤
│ 737472696E67 │ string │
└──────────────────────────────────────────────────────┘
Returns a string with all characters changed to uppercase.
func.unhex(<expr>)
func.upper('hello, plaidcloud lakehouse!')
+--------------------------------------------+
| func.upper('hello, plaidcloud lakehouse!') |
+--------------------------------------------+
| 'HELLO, PLAIDCLOUD LAKEHOUSE!' |
+--------------------------------------------+
UPPER(<str>)
VARCHAR
SELECT UPPER('hello, databend!'), UCASE('hello, databend!');
┌───────────────────────────────────────────────────────┐
│ upper('hello, databend!') │ ucase('hello, databend!') │
├───────────────────────────┼───────────────────────────┤
│ HELLO, DATABEND! │ HELLO, DATABEND! │
└───────────────────────────────────────────────────────┘