Skip to content

String functions

CHARACTER_LENGTH(s) | CHAR_LENGTH(s)
length of string, in UTF8 characters
CONCAT(val1, val2, ...)
Concatenates all arguments after converting to text. Unlike the || operator, CONCAT ignores NULL arguments instead of returning NULL.
CONCAT('Hello', ' ', 'World')
-- 'Hello World'
CONCAT('Hello', NULL, ' World')
-- 'Hello World'
CONCAT(NULL, NULL)
-- ''
CONCAT('Value: ', 42)
-- 'Value: 42'
FORMAT(format_str, ...)
Formats arguments according to a format string (PostgreSQL-compatible). Format specifiers: %[position][flags][width]type
  • position: n$ where n is 1-based argument index
  • flags: - for left-justify within width
  • width: minimum field width (number, * for next arg, or *n$ for positional arg)
  • type:
  • s - format as string
  • I - format as SQL identifier (double-quoted if needed)
  • L - format as SQL literal (single-quoted, with proper escaping)
  • % - output a literal %
Examples:
FORMAT('Hello %s', 'World')
-- 'Hello World'
FORMAT('Testing %s, %s, %s', 'one', 'two', 'three')
-- 'Testing one, two, three'
FORMAT('INSERT INTO %I VALUES(%L)', 'Foo bar', 'O''Reilly')
-- 'INSERT INTO "Foo bar" VALUES(''O''''Reilly'')'
FORMAT('|%10s|', 'foo')
-- '| foo|'
FORMAT('|%-10s|', 'foo')
-- '|foo |'
FORMAT('Testing %3$s, %2$s, %1$s', 'one', 'two', 'three')
-- 'Testing three, two, one'
Notes:
  • %I throws an error if the argument is null
  • %L outputs NULL (unquoted) for null arguments
  • A specifier without a position uses the next argument after the last consumed
str [NOT] LIKE like_pattern
Returns true iff the str matches (/ doesn’t match) the like_pattern. like_pattern can contain:
  • _: matches any single character
  • %: matches 0-n characters
str [NOT] LIKE_REGEX regex [FLAG flags]
Returns true iff the str matches (/ doesn’t match) the regex. See Regular expressions in XTDB for more details.
REPLACE(s, target, replacement)
Replace all occurrences of target in s with replacement (literal string matching).
REGEXP_REPLACE(s, pattern, replacement [, flags])
Replace all occurrences of pattern in s with replacement (regex matching). See Regular expressions in XTDB for more details.
LOWER(str)
lower-case
OVERLAY(str PLACING replacement FROM start_pos [FOR length])
replace length characters of str starting at start_pos with replacement
  • start_pos: 1-based start position
  • length: defaults to end-of-string if not provided
POSITION(search IN str [USING CHARACTERS])
position of search within str, in characters
  • Return value is 1-based.
  • Returns 0 if not found.
TRIM([trim_char FROM] str) | TRIM(BOTH [trim_char] FROM str)
remove any occurrences of trim_char from the start and end of str
  • trim_char: single character (defaults to ‘space’).
TRIM(LEADING [trim_char] FROM str)
remove any occurrences of trim_char from the start of str
  • trim_char: single character (defaults to ‘space’).
TRIM(TRAILING [trim_char] FROM str)
remove any occurrences of trim_char from the end of `str
  • trim_char: single character (defaults to ‘space’).
OCTET_LENGTH(s)
length of string, in octets
POSITION(search IN str USING OCTETS)

position of search within str, in octets

Returns 0 if not found.

SUBSTRING(str FROM from_pos) | SUBSTRING(str FROM from_pos FOR length)
Returns the sub-string of the given str from from_pos for length characters
  • from_pos: 1-based start position
  • length: defaults to end-of-string if not provided
REVERSE(str)
reverses the characters in the string
PARSE_IDENT(qualified_name) (v2.2+)
splits a qualified SQL identifier on ., returning an array of its parts.
  • Unquoted parts are folded to lowercase; double-quoted parts preserve case.
  • "" within a quoted part is an escaped double quote.
  • Throws on unterminated quotes, empty identifiers, or unexpected characters.
  • PostgreSQL-compatible.
PARSE_IDENT('public.USERS') -- ['public', 'users']
PARSE_IDENT('"My Schema"."tbl"') -- ['My Schema', 'tbl']
QUOTE_IDENT(name) (v2.2+)
returns name formatted as a SQL identifier — double-quoted with " escaped to "" if necessary, bare otherwise.
  • PostgreSQL-compatible.
QUOTE_IDENT('users') -- 'users'
QUOTE_IDENT('User Table') -- '"User Table"'
QUOTE_IDENT('from') -- '"from"' (reserved word)
STRING_TO_ARRAY(str, delimiter) (v2.2+)
splits str on delimiter, returning an array of strings.
  • delimiter is matched as a literal string (not a regex).
  • If delimiter is NULL, str is split into its individual characters.
  • If delimiter is the empty string, returns a single-element array containing str.
  • If str is NULL, returns NULL.
  • PostgreSQL-compatible.
STRING_TO_ARRAY('a,b,c', ',')
-- ['a', 'b', 'c']
STRING_TO_ARRAY('abc', NULL)
-- ['a', 'b', 'c']
UPPER(str)
upper-case

Regular expressions (‘regexes’) in XTDB

Section titled “Regular expressions (‘regexes’) in XTDB”

XTDB regular expressions use Java’s Pattern syntax.

Supported flags (string, e.g. 'im') are as follows:

  • s: ‘dot’ matches any character (including line terminators)
  • i: case insensitive
  • m: multi-line