Table 11.6. String Operators
Name | Description |
---|---|
ASCII() |
Return numeric value of left-most character |
BIN() |
Return a string representation of the argument |
BIT_LENGTH() |
Return length of argument in bits |
CHAR_LENGTH() |
Return number of characters in argument |
CHAR() |
Return the character for each integer passed |
CHARACTER_LENGTH() |
A synonym for CHAR_LENGTH() |
CONCAT_WS() |
Return concatenate with separator |
CONCAT() |
Return concatenated string |
ELT() |
Return string at index number |
EXPORT_SET() |
Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string |
FIELD() |
Return the index (position) of the first argument in the subsequent arguments |
FIND_IN_SET() |
Return the index position of the first argument within the second argument |
FORMAT() |
Return a number formatted to specified number of decimal places |
HEX() |
Return a hexadecimal representation of a decimal or string value |
INSERT() |
Insert a substring at the specified position up to the specified number of characters |
INSTR() |
Return the index of the first occurrence of substring |
LCASE() |
Synonym for LOWER() |
LEFT() |
Return the leftmost number of characters as specified |
LENGTH() |
Return the length of a string in bytes |
LIKE |
Simple pattern matching |
LOAD_FILE() |
Load the named file |
LOCATE() |
Return the position of the first occurrence of substring |
LOWER() |
Return the argument in lowercase |
LPAD() |
Return the string argument, left-padded with the specified string |
LTRIM() |
Remove leading spaces |
MAKE_SET() |
Return a set of comma-separated strings that have the corresponding bit in bits set |
MATCH |
Perform full-text search |
MID() |
Return a substring starting from the specified position |
NOT LIKE |
Negation of simple pattern matching |
NOT REGEXP |
Negation of REGEXP |
OCTET_LENGTH() |
A synonym for LENGTH() |
ORD() |
Return character code for leftmost character of the argument |
POSITION() |
A synonym for LOCATE() |
QUOTE() |
Escape the argument for use in an SQL statement |
REGEXP |
Pattern matching using regular expressions |
REPEAT() |
Repeat a string the specified number of times |
REPLACE() |
Replace occurrences of a specified string |
REVERSE() |
Reverse the characters in a string |
RIGHT() |
Return the specified rightmost number of characters |
RLIKE |
Synonym for REGEXP |
RPAD() |
Append string the specified number of times |
RTRIM() |
Remove trailing spaces |
SOUNDEX() |
Return a soundex string |
SOUNDS LIKE (v4.1.0) |
Compare sounds |
SPACE() |
Return a string of the specified number of spaces |
STRCMP() |
Compare two strings |
SUBSTR() |
Return the substring as specified |
SUBSTRING_INDEX() |
Return a substring from a string before the specified number of occurrences of the delimiter |
SUBSTRING() |
Return the substring as specified |
TRIM() |
Remove leading and trailing spaces |
UCASE() |
Synonym for UPPER() |
UNHEX() (v4.1.2) |
Convert each pair of hexadecimal digits to a character |
UPPER() |
Convert to uppercase |
String-valued functions return NULL
if the
length of the result would be greater than the value of the
max_allowed_packet
system
variable. See Section 7.5.3, “Tuning Server Parameters”.
For functions that operate on string positions, the first position is numbered 1.
For functions that take length arguments, noninteger arguments are rounded to the nearest integer.
Returns the numeric value of the leftmost character of the
string str
. Returns
0
if str
is the
empty string. Returns NULL
if
str
is NULL
.
ASCII()
works for 8-bit
characters.
mysql>SELECT ASCII('2');
-> 50 mysql>SELECT ASCII(2);
-> 50 mysql>SELECT ASCII('dx');
-> 100
See also the ORD()
function.
Returns a string representation of the binary value of
N
, where
N
is a longlong
(BIGINT
) number. This is
equivalent to
CONV(
.
Returns N
,10,2)NULL
if
N
is NULL
.
mysql> SELECT BIN(12);
-> '1100'
Returns the length of the string
str
in bits.
mysql> SELECT BIT_LENGTH('text');
-> 32
CHAR(
N
,...
[USING charset_name
])
CHAR()
interprets each argument
N
as an integer and returns a
string consisting of the characters given by the code values
of those integers. NULL
values are skipped.
mysql>SELECT CHAR(77,121,83,81,'76');
-> 'MySQL' mysql>SELECT CHAR(77,77.3,'77.3');
-> 'MMM'
CHAR()
arguments larger than
255 are converted into multiple result bytes. For example,
CHAR(256)
is equivalent to
CHAR(1,0)
, and
CHAR(256*256)
is equivalent to
CHAR(1,0,0)
:
mysql>SELECT HEX(CHAR(1,0)), HEX(CHAR(256));
+----------------+----------------+ | HEX(CHAR(1,0)) | HEX(CHAR(256)) | +----------------+----------------+ | 0100 | 0100 | +----------------+----------------+ mysql>SELECT HEX(CHAR(1,0,0)), HEX(CHAR(256*256));
+------------------+--------------------+ | HEX(CHAR(1,0,0)) | HEX(CHAR(256*256)) | +------------------+--------------------+ | 010000 | 010000 | +------------------+--------------------+
By default, CHAR()
returns a
binary string. To produce a string in a given character set,
use the optional USING
clause:
mysql> SELECT CHARSET(CHAR(0x65)), CHARSET(CHAR(0x65 USING utf8));
+---------------------+--------------------------------+
| CHARSET(CHAR(0x65)) | CHARSET(CHAR(0x65 USING utf8)) |
+---------------------+--------------------------------+
| binary | utf8 |
+---------------------+--------------------------------+
If USING
is given and the result string is
illegal for the given character set, a warning is issued.
Also, if strict SQL mode is enabled, the result from
CHAR()
becomes
NULL
.
Returns the length of the string
str
, measured in characters. A
multi-byte character counts as a single character. This means
that for a string containing five two-byte characters,
LENGTH()
returns
10
, whereas
CHAR_LENGTH()
returns
5
.
CHARACTER_LENGTH()
is a synonym
for CHAR_LENGTH()
.
Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are nonbinary strings, the result is a nonbinary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent binary string form; if you want to avoid that, you can use an explicit type cast, as in this example:
SELECT CONCAT(CAST(int_col
AS CHAR),char_col
);
CONCAT()
returns
NULL
if any argument is
NULL
.
mysql>SELECT CONCAT('My', 'S', 'QL');
-> 'MySQL' mysql>SELECT CONCAT('My', NULL, 'QL');
-> NULL mysql>SELECT CONCAT(14.3);
-> '14.3'
For quoted strings, concatenation can be performed by placing the strings next to each other:
mysql> SELECT 'My' 'S' 'QL';
-> 'MySQL'
CONCAT_WS(
separator
,str1
,str2
,...)
CONCAT_WS()
stands for
Concatenate With Separator and is a special form of
CONCAT()
. The first argument is
the separator for the rest of the arguments. The separator is
added between the strings to be concatenated. The separator
can be a string, as can the rest of the arguments. If the
separator is NULL
, the result is
NULL
.
mysql>SELECT CONCAT_WS(',','First name','Second name','Last Name');
-> 'First name,Second name,Last Name' mysql>SELECT CONCAT_WS(',','First name',NULL,'Last Name');
-> 'First name,Last Name'
CONCAT_WS()
does not skip empty
strings. However, it does skip any NULL
values after the separator argument.
Returns str1
if
N
= 1
,
str2
if
N
= 2
, and so
on. Returns NULL
if
N
is less than 1
or greater than the number of arguments.
ELT()
is the complement of
FIELD()
.
mysql>SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo');
-> 'ej' mysql>SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo');
-> 'foo'
EXPORT_SET(
bits
,on
,off
[,separator
[,number_of_bits
]])
Returns a string such that for every bit set in the value
bits
, you get an
on
string and for every bit not set
in the value, you get an off
string. Bits in bits
are examined
from right to left (from low-order to high-order bits).
Strings are added to the result from left to right, separated
by the separator
string (the
default being the comma character
“,
”). The number of bits
examined is given by number_of_bits
(defaults to 64).
mysql>SELECT EXPORT_SET(5,'Y','N',',',4);
-> 'Y,N,Y,N' mysql>SELECT EXPORT_SET(6,'1','0',',',10);
-> '0,1,1,0,0,0,0,0,0,0'
Returns the index (position) of str
in the str1
,
str2
,
str3
, ...
list.
Returns 0
if str
is not found.
If all arguments to FIELD()
are
strings, all arguments are compared as strings. If all
arguments are numbers, they are compared as numbers.
Otherwise, the arguments are compared as double.
If str
is NULL
,
the return value is 0
because
NULL
fails equality comparison with any
value. FIELD()
is the
complement of ELT()
.
mysql>SELECT FIELD('ej', 'Hej', 'ej', 'Heja', 'hej', 'foo');
-> 2 mysql>SELECT FIELD('fo', 'Hej', 'ej', 'Heja', 'hej', 'foo');
-> 0
Returns a value in the range of 1 to
N
if the string
str
is in the string list
strlist
consisting of
N
substrings. A string list is a
string composed of substrings separated by
“,
” characters. If the first
argument is a constant string and the second is a column of
type SET
, the
FIND_IN_SET()
function is
optimized to use bit arithmetic. Returns 0
if str
is not in
strlist
or if
strlist
is the empty string.
Returns NULL
if either argument is
NULL
. This function does not work properly
if the first argument contains a comma
(“,
”) character.
mysql> SELECT FIND_IN_SET('b','a,b,c,d');
-> 2
Formats the number X
to a format
like '#,###,###.##'
, rounded to
D
decimal places, and returns the
result as a string. If D
is
0
, the result has no decimal point or
fractional part.
The optional third parameter enables a locale to be specified
to be used for the result number's decimal point, thousands
separator, and grouping between separators. Allowable locale
values are the same as the legal values for the
lc_time_names
system variable
(see Section 9.7, “MySQL Server Locale Support”). If no locale is
specified, the default is 'en_US'
.
mysql>SELECT FORMAT(12332.123456, 4);
-> '12,332.1235' mysql>SELECT FORMAT(12332.1,4);
-> '12,332.1000' mysql>SELECT FORMAT(12332.2,0);
-> '12,332' mysql>SELECT FORMAT(12332.2,2,'de_DE');
-> '12.332,20'
If N_or_S
is a number, returns a
string representation of the hexadecimal value of
N
, where
N
is a longlong
(BIGINT
) number. This is
equivalent to
CONV(
.
N
,10,16)
If N_or_S
is a string, returns a
hexadecimal string representation of
N_or_S
where each character in
N_or_S
is converted to two
hexadecimal digits. The inverse of this operation is performed
by the UNHEX()
function.
mysql>SELECT HEX(255);
-> 'FF' mysql>SELECT 0x616263;
-> 'abc' mysql>SELECT HEX('abc');
-> 616263
Returns the string str
, with the
substring beginning at position pos
and len
characters long replaced by
the string newstr
. Returns the
original string if pos
is not
within the length of the string. Replaces the rest of the
string from position pos
if
len
is not within the length of the
rest of the string. Returns NULL
if any
argument is NULL
.
mysql>SELECT INSERT('Quadratic', 3, 4, 'What');
-> 'QuWhattic' mysql>SELECT INSERT('Quadratic', -1, 4, 'What');
-> 'Quadratic' mysql>SELECT INSERT('Quadratic', 3, 100, 'What');
-> 'QuWhat'
This function is multi-byte safe.
Returns the position of the first occurrence of substring
substr
in string
str
. This is the same as the
two-argument form of LOCATE()
,
except that the order of the arguments is reversed.
mysql>SELECT INSTR('foobarbar', 'bar');
-> 4 mysql>SELECT INSTR('xbar', 'foobar');
-> 0
This function is multi-byte safe, and is case sensitive only if at least one argument is a binary string.
Returns the leftmost len
characters
from the string str
, or
NULL
if any argument is
NULL
.
mysql> SELECT LEFT('foobarbar', 5);
-> 'fooba'
Returns the length of the string
str
, measured in bytes. A
multi-byte character counts as multiple bytes. This means that
for a string containing five two-byte characters,
LENGTH()
returns
10
, whereas
CHAR_LENGTH()
returns
5
.
mysql> SELECT LENGTH('text');
-> 4
Reads the file and returns the file contents as a string. To
use this function, the file must be located on the server
host, you must specify the full path name to the file, and you
must have the FILE
privilege.
The file must be readable by all and its size less than
max_allowed_packet
bytes. If
the secure_file_priv
system
variable is set to a nonempty directory name, the file to be
loaded must be located in that directory.
If the file does not exist or cannot be read because one of
the preceding conditions is not satisfied, the function
returns NULL
.
The character_set_filesystem
system variable controls interpretation of file names that are
given as literal strings.
mysql>UPDATE t
SET blob_col=LOAD_FILE('/tmp/picture')
WHERE id=1;
LOCATE(
,
substr
,str
)LOCATE(
substr
,str
,pos
)
The first syntax returns the position of the first occurrence
of substring substr
in string
str
. The second syntax returns the
position of the first occurrence of substring
substr
in string
str
, starting at position
pos
. Returns 0
if substr
is not in
str
.
mysql>SELECT LOCATE('bar', 'foobarbar');
-> 4 mysql>SELECT LOCATE('xbar', 'foobar');
-> 0 mysql>SELECT LOCATE('bar', 'foobarbar', 5);
-> 7
This function is multi-byte safe, and is case-sensitive only if at least one argument is a binary string.
Returns the string str
with all
characters changed to lowercase according to the current
character set mapping. The default is
latin1
(cp1252 West European).
mysql> SELECT LOWER('QUADRATICALLY');
-> 'quadratically'
LOWER()
(and
UPPER()
) are ineffective when
applied to binary strings
(BINARY
,
VARBINARY
,
BLOB
). To perform lettercase
conversion, convert the string to a nonbinary string:
mysql>SET @str = BINARY 'New York';
mysql>SELECT LOWER(@str), LOWER(CONVERT(@str USING latin1));
+-------------+-----------------------------------+ | LOWER(@str) | LOWER(CONVERT(@str USING latin1)) | +-------------+-----------------------------------+ | New York | new york | +-------------+-----------------------------------+
This function is multi-byte safe.
Returns the string str
, left-padded
with the string padstr
to a length
of len
characters. If
str
is longer than
len
, the return value is shortened
to len
characters.
mysql>SELECT LPAD('hi',4,'??');
-> '??hi' mysql>SELECT LPAD('hi',1,'??');
-> 'h'
Returns the string str
with leading
space characters removed.
mysql> SELECT LTRIM(' barbar');
-> 'barbar'
This function is multi-byte safe.
Returns a set value (a string containing substrings separated
by “,
” characters) consisting
of the strings that have the corresponding bit in
bits
set.
str1
corresponds to bit 0,
str2
to bit 1, and so on.
NULL
values in
str1
,
str2
, ...
are
not appended to the result.
mysql>SELECT MAKE_SET(1,'a','b','c');
-> 'a' mysql>SELECT MAKE_SET(1 | 4,'hello','nice','world');
-> 'hello,world' mysql>SELECT MAKE_SET(1 | 4,'hello','nice',NULL,'world');
-> 'hello' mysql>SELECT MAKE_SET(0,'a','b','c');
-> ''
MID(
is a synonym for
str
,pos
,len
)SUBSTRING(
.
str
,pos
,len
)
OCTET_LENGTH()
is a synonym for
LENGTH()
.
If the leftmost character of the string
str
is a multi-byte character,
returns the code for that character, calculated from the
numeric values of its constituent bytes using this formula:
(1st byte code) + (2nd byte code × 256) + (3rd byte code × 2562) ...
If the leftmost character is not a multi-byte character,
ORD()
returns the same value as
the ASCII()
function.
mysql> SELECT ORD('2');
-> 50
POSITION(
is a synonym for
substr
IN str
)LOCATE(
.
substr
,str
)
Quotes a string to produce a result that can be used as a
properly escaped data value in an SQL statement. The string is
returned enclosed by single quotes and with each instance of
single quote (“'
”), backslash
(“\
”), ASCII
NUL
, and Control-Z preceded by a backslash.
If the argument is NULL
, the return value
is the word “NULL” without enclosing single
quotes.
mysql>SELECT QUOTE('Don\'t!');
-> 'Don\'t!' mysql>SELECT QUOTE(NULL);
-> NULL
Returns a string consisting of the string
str
repeated
count
times. If
count
is less than 1, returns an
empty string. Returns NULL
if
str
or
count
are NULL
.
mysql> SELECT REPEAT('MySQL', 3);
-> 'MySQLMySQLMySQL'
Returns the string str
with all
occurrences of the string from_str
replaced by the string to_str
.
REPLACE()
performs a
case-sensitive match when searching for
from_str
.
mysql> SELECT REPLACE('www.mysql.com', 'w', 'Ww');
-> 'WwWwWw.mysql.com'
This function is multi-byte safe.
Returns the string str
with the
order of the characters reversed.
mysql> SELECT REVERSE('abc');
-> 'cba'
This function is multi-byte safe.
Returns the rightmost len
characters from the string str
, or
NULL
if any argument is
NULL
.
mysql> SELECT RIGHT('foobarbar', 4);
-> 'rbar'
This function is multi-byte safe.
Returns the string str
,
right-padded with the string padstr
to a length of len
characters. If
str
is longer than
len
, the return value is shortened
to len
characters.
mysql>SELECT RPAD('hi',5,'?');
-> 'hi???' mysql>SELECT RPAD('hi',1,'?');
-> 'h'
This function is multi-byte safe.
Returns the string str
with
trailing space characters removed.
mysql> SELECT RTRIM('barbar ');
-> 'barbar'
This function is multi-byte safe.
Returns a soundex string from str
.
Two strings that sound almost the same should have identical
soundex strings. A standard soundex string is four characters
long, but the SOUNDEX()
function returns an arbitrarily long string. You can use
SUBSTRING()
on the result to
get a standard soundex string. All nonalphabetic characters in
str
are ignored. All international
alphabetic characters outside the A-Z range are treated as
vowels.
When using SOUNDEX()
, you
should be aware of the following limitations:
This function, as currently implemented, is intended to work well with strings that are in the English language only. Strings in other languages may not produce reliable results.
This function is not guaranteed to provide consistent
results with strings that use multi-byte character sets,
including utf-8
.
We hope to remove these limitations in a future release. See Bug#22638 for more information.
mysql>SELECT SOUNDEX('Hello');
-> 'H400' mysql>SELECT SOUNDEX('Quadratically');
-> 'Q36324'
This function implements the original Soundex algorithm, not the more popular enhanced version (also described by D. Knuth). The difference is that original version discards vowels first and duplicates second, whereas the enhanced version discards duplicates first and vowels second.
This is the same as
SOUNDEX(
.
expr1
)
= SOUNDEX(expr2
)
Returns a string consisting of N
space characters.
mysql> SELECT SPACE(6);
-> ' '
SUBSTR(
,
str
,pos
)SUBSTR(
,
str
FROM pos
)SUBSTR(
,
str
,pos
,len
)SUBSTR(
str
FROM pos
FOR
len
)
SUBSTR()
is a synonym for
SUBSTRING()
.
SUBSTRING(
,
str
,pos
)SUBSTRING(
,
str
FROM pos
)SUBSTRING(
,
str
,pos
,len
)SUBSTRING(
str
FROM pos
FOR
len
)
The forms without a len
argument
return a substring from string str
starting at position pos
. The forms
with a len
argument return a
substring len
characters long from
string str
, starting at position
pos
. The forms that use
FROM
are standard SQL syntax. It is also
possible to use a negative value for
pos
. In this case, the beginning of
the substring is pos
characters
from the end of the string, rather than the beginning. A
negative value may be used for pos
in any of the forms of this function.
For all forms of SUBSTRING()
,
the position of the first character in the string from which
the substring is to be extracted is reckoned as
1
.
mysql>SELECT SUBSTRING('Quadratically',5);
-> 'ratically' mysql>SELECT SUBSTRING('foobarbar' FROM 4);
-> 'barbar' mysql>SELECT SUBSTRING('Quadratically',5,6);
-> 'ratica' mysql>SELECT SUBSTRING('Sakila', -3);
-> 'ila' mysql>SELECT SUBSTRING('Sakila', -5, 3);
-> 'aki' mysql>SELECT SUBSTRING('Sakila' FROM -4 FOR 2);
-> 'ki'
This function is multi-byte safe.
If len
is less than 1, the result
is the empty string.
SUBSTRING_INDEX(
str
,delim
,count
)
Returns the substring from string
str
before
count
occurrences of the delimiter
delim
. If
count
is positive, everything to
the left of the final delimiter (counting from the left) is
returned. If count
is negative,
everything to the right of the final delimiter (counting from
the right) is returned.
SUBSTRING_INDEX()
performs a
case-sensitive match when searching for
delim
.
mysql>SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
-> 'www.mysql' mysql>SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);
-> 'mysql.com'
This function is multi-byte safe.
TRIM([{BOTH | LEADING | TRAILING}
[
,
remstr
] FROM]
str
)TRIM([
remstr
FROM] str
)
Returns the string str
with all
remstr
prefixes or suffixes
removed. If none of the specifiers BOTH
,
LEADING
, or TRAILING
is
given, BOTH
is assumed.
remstr
is optional and, if not
specified, spaces are removed.
mysql>SELECT TRIM(' bar ');
-> 'bar' mysql>SELECT TRIM(LEADING 'x' FROM 'xxxbarxxx');
-> 'barxxx' mysql>SELECT TRIM(BOTH 'x' FROM 'xxxbarxxx');
-> 'bar' mysql>SELECT TRIM(TRAILING 'xyz' FROM 'barxxyz');
-> 'barx'
This function is multi-byte safe.
Performs the inverse operation of
HEX(
.
That is, it interprets each pair of hexadecimal digits in the
argument as a number and converts it to the character
represented by the number. The resulting characters are
returned as a binary string.
str
)
mysql>SELECT UNHEX('4D7953514C');
-> 'MySQL' mysql>SELECT 0x4D7953514C;
-> 'MySQL' mysql>SELECT UNHEX(HEX('string'));
-> 'string' mysql>SELECT HEX(UNHEX('1267'));
-> '1267'
The characters in the argument string must be legal
hexadecimal digits: '0'
..
'9'
, 'A'
..
'F'
, 'a'
..
'f'
. If
UNHEX()
encounters any
nonhexadecimal digits in the argument, it returns
NULL
:
mysql> SELECT UNHEX('GG');
+-------------+
| UNHEX('GG') |
+-------------+
| NULL |
+-------------+
A NULL
result can occur if the argument to
UNHEX()
is a
BINARY
column, because values
are padded with 0x00 bytes when stored but those bytes are not
stripped on retrieval. For example 'aa'
is
stored into a CHAR(3)
column as
'aa '
and retrieved as
'aa'
(with the trailing pad space
stripped), so UNHEX()
for the
column value returns 'A'
. By contrast
'aa'
is stored into a
BINARY(3)
column as
'aa\0'
and retrieved as
'aa\0'
(with the trailing pad
0x00
byte not stripped).
'\0'
is not a legal hexadecimal digit, so
UNHEX()
for the column value
returns NULL
.
Returns the string str
with all
characters changed to uppercase according to the current
character set mapping. The default is
latin1
(cp1252 West European).
mysql> SELECT UPPER('Hej');
-> 'HEJ'
UPPER()
is ineffective when
applied to binary strings
(BINARY
,
VARBINARY
,
BLOB
). The description of
LOWER()
shows how to perform
lettercase conversion of binary strings.
This function is multi-byte safe.
User Comments
The following ORDER BY will sort a column called Host (varchar 255) that contains either a Hostname or IP addresses, whenever the IP address is found it will be sorted not as string but as integers:
ORDER BY
CAST(SUBSTRING(
Host,
1,
LOCATE('.', Host) - 1)
AS UNSIGNED),
CAST(SUBSTRING(
Host,
LOCATE('.', Host) + 1,
LOCATE('.', Host, LOCATE('.', Host) + 1)
- LOCATE('.', Host) - 1)
AS UNSIGNED),
CAST(SUBSTRING(
Host,
LOCATE('.', Host, LOCATE('.', Host) + 1) + 1,
LOCATE('.', Host,
LOCATE('.', Host, LOCATE('.', Host) + 1) + 1)
- LOCATE('.', Host, LOCATE('.', Host) + 1) - 1)
AS UNSIGNED),
CAST(SUBSTRING(
Host,
LOCATE('.', Host, LOCATE('.', Host,
LOCATE('.', Host) + 1) + 1) + 1,
3)
AS UNSIGNED)
MySQL does not include a function to split a delimited string. Although separated data would normally be split into separate fields within a relation data, spliting such can be useful either during initial data load/validation or where such data is held in a text field.
The following formula can be used to extract the Nth item in a delimited list, in this case the 3rd item "ccccc" in the example comma separated list.
select replace(substring(substring_index('aaa,bbbb,ccccc', ',', 3), length(substring_index('aaa,bbbb,ccccc', ',', 3 - 1)) + 1), ',', '') ITEM3
The above formula does not need the first item to be handled as a special case and returns empty strings correctly when the item count is less than the position requested.
More on this and related matters can be found at http://www.kanolife.com/escape/2006/03/mysql-string-splitter.html
This will split an IP address ("a.b.c.d") into 4 respective octets:
SELECT
`ip` ,
SUBSTRING_INDEX( `ip` , '.', 1 ) AS a,
SUBSTRING_INDEX(SUBSTRING_INDEX( `ip` , '.', 2 ),'.',-1) AS b,
SUBSTRING_INDEX(SUBSTRING_INDEX( `ip` , '.', -2 ),'.',1) AS c,
SUBSTRING_INDEX( `ip` , '.', -1 ) AS d
FROM log_table
I found myself wanting a unique list of domain names from a table/column of fully qualified email addresses. There isn't a split function so using the other comments here I've devised this:
SELECT DISTINCT REVERSE(LEFT(REVERSE(email),LOCATE('@',REVERSE(email)) - 1)) AS domain FROM table ORDER BY domain
Reverses email, counts the characters from left minus the @. Reverses the reverse and returns 'domain.com'.
Perhaps there is a better/fast/easier way, however it's not easily found. So here is mine.
[name withheld], you could extract the domain in two simpler ways:
SELECT SUBSTRING_INDEX(email,'@',-1) AS domain FROM TABLE -- returns everything to the right of the rightmost @
SELECT SUBSTRING(email,INSTR(email,'@')+1) AS domain FROM TABLE -- returns everything to the right of the leftmost @
Both will return identical results on email addresses, since they only have one @ in them. I can't believe you didn't think of SUBSTRING_INDEX, even after the previous two comments used it :)
An example of how to make the first letter in a string uppercase - analogous to UCFIRST
SELECT CONCAT(UPPER(SUBSTRING(firstName, 1, 1)), LOWER(SUBSTRING(firstName FROM 2))) AS properFirstName
CONCAT_WS(' AND ', NULL, NULL)
returns empty string instead of NULL, so
SET @where_cond = CONCAT_WS(' AND ', @where1, @where2);
SET @sql = CONCAT_WS(' WHERE ', 'SELECT * FROM table', @where_cond);
SELECT @sql;
results in
SELECT * FROM table WHERE
if both @where1 and @where2 are NULL
For folks trying to lookup Countries associated with IPs (as in, e.g., databases found here: http://ip-to-country.webhosting.info/node/view/6), this should do the trick, building on an earlier contribution to this page (Assumes your IP is called "RemoteAddress"):
select RemoteAddress as IP,
( SUBSTRING_INDEX( RemoteAddress, '.', 1 ) * 16777216 +
SUBSTRING_INDEX(SUBSTRING_INDEX( RemoteAddress, '.', 2 ),'.',-1) * 65536 +
SUBSTRING_INDEX(SUBSTRING_INDEX( RemoteAddress, '.', -2 ),'.',1) * 256 +
SUBSTRING_INDEX( RemoteAddress, '.', -1 )
) AS IP2Num
FROM log
;
Here is a trick to create a simple horizontal graph:
SELECT ColName, EXPORT_SET(pow(2,round(ColName))-1,'+','-','',70) FROM TableName
This will create an area filled with "+", where the length of each "+" bar equals the number in column ColName in that row.
70 is an upper bound on the values in ColName; change it to match your actual data.
Correction to the previous tip: in the current version, EXPORT_SET does not create a string with more than 64 chars, even if you explicitly ask for 70 chars.
Another problem is that for numbers N > 53, 2^N - 1 equals 2^N because of rounding errors, so you will not see a bar, only a single "+".
I created a user-defined function in MySQL 5.0+ similar to PHP's substr_count(), since I could not find an equivalent native function in MySQL. (If there is one please tell me!!!)
delimiter ;delimiter ||
DROP FUNCTION IF EXISTS substrCount||
CREATE FUNCTION substrCount(s VARCHAR(255), ss VARCHAR(255)) RETURNS TINYINT(3) UNSIGNED LANGUAGE SQL NOT DETERMINISTIC READS SQL DATA
BEGIN
DECLARE count TINYINT(3) UNSIGNED;
DECLARE offset TINYINT(3) UNSIGNED;
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET s = NULL;
SET count = 0;
SET offset = 1;
REPEAT
IF NOT ISNULL(s) AND offset > 0 THEN
SET offset = LOCATE(ss, s, offset);
IF offset > 0 THEN
SET count = count + 1;
SET offset = offset + 1;
END IF;
END IF;
UNTIL ISNULL(s) OR offset = 0 END REPEAT;
RETURN count;
END;
Use like this:
SELECT substrCount('/this/is/a/path', '/') `count`;
`count` would return 4 in this case. Can be used in such cases where you might want to find the "depth" of a path, or for many other uses.
To [name withheld] who suggested a method for turning IP addresses into numbers, I would suggest that the INET_ATON() function is a little easier to use!
It's pretty easy to create your own string functions for many examples listed here
## Count substrings
CREATE FUNCTION substrCount(x varchar(255), delim varchar(12)) returns int
return (length(x)-length(REPLACE(x, delim, '')))/length(delim);
SELECT substrCount('/this/is/a/path', '/') as count;
SELECT substrCount('/this/is/a/path', 'is') as count;
## Split delimited strings
CREATE FUNCTION strSplit(x varchar(255), delim varchar(12), pos int) returns varchar(255)
return replace(substring(substring_index(x, delim, pos), length(substring_index(x, delim, pos - 1)) + 1), delim, '');
select strSplit("aaa,b,cc,d", ',', 2) as second;
select strSplit("a|bb|ccc|dd", '|', 3) as third;
select strSplit("aaa,b,cc,d", ',', 7) as 7th;
## Upper case first letter, UCFIRST or INITCAP
CREATE FUNCTION ucfirst(x varchar(255)) returns varchar(255)
return concat( upper(substring(x,1,1)),lower(substring(x,2)) );
select ucfirst("TEST");
##Or a more complicated example, this will repeat an insert after every nth position.
drop function insert2;
DELIMITER //
CREATE FUNCTION insert2(str text, pos int, delimit varchar(124))
RETURNS text
DETERMINISTIC
BEGIN
DECLARE i INT DEFAULT 1;
DECLARE str_len INT;
DECLARE out_str text default '';
SET str_len=length(str);
WHILE(i<str_len) DO
SET out_str=CONCAT(out_str, SUBSTR(str, i,pos), delimit);
SET i=i+pos;
END WHILE;
-- trim delimiter from end of string
SET out_str=TRIM(trailing delimit from out_str);
RETURN(out_str);
END//
DELIMITER ;
select insert2("ATGCATACAGTTATTTGA", 3, " ") as seq2;
I was trying to output a text message that told me if a field was null. I tried various mechanisms but this proved to do the trick.
elt(((field1 <=> NULL) + 1),"not null", "null")
evaluating (field1 <=> NULL) returns 0 (zero) if the field is not null and 1 (one) if the field is null. Adding 1 (one) to this result provides positional information that fits what 'elt' expects.
elt will return "not null" (position 1) if the evaluation of ((field1 <=> NULL) + 1) = 1
it will return "null" (position 2) if the evaluation of ((field1 <=> NULL) + 1) = 2
This can be altered to output messages based on any test that I've tried. Just remember that 'elt' returns null or 1 for a comparison so you need to add 1 (one) to that result to be able to choose between different messages.
As above I couldn't find a function for splitting strings based on a character set rather than string position, where the results were independent of substring lengths. I used this query to split out the Swiss-Prot accession numbers from BLAST result subject ID's, which are bracketed by pipe ('|') characters, but any two relatively unique characters should work.
select left(substring(<columnName>,locate('|',<columnName>)+1),
locate('|',substring(<columnName>,
locate('|',<columnName>)+1))-1)
as '<resultColumnName>' from <table>
This is the php function strip_tags
1 row in set (0.00 sec)delimiter ||
DROP FUNCTION IF EXISTS strip_tags||
CREATE FUNCTION strip_tags( x longtext) RETURNS longtext
LANGUAGE SQL NOT DETERMINISTIC READS SQL DATA
BEGIN
DECLARE sstart INT UNSIGNED;
DECLARE ends INT UNSIGNED;
SET sstart = LOCATE('<', x, 1);
REPEAT
SET ends = LOCATE('>', x, sstart);
SET x = CONCAT(SUBSTRING( x, 1 ,sstart -1) ,SUBSTRING(x, ends +1 )) ;
SET sstart = LOCATE('<', x, 1);
UNTIL sstart < 1 END REPEAT;
return x;
END;
||
delimiter ;
mysql> SELECT strip_tags('<a href="HelloWorld.html"><B>Hi, mate!</B></a>') as strip_tags;
If using statement-based replication, load_file() will fail on the slave unless the same file exists on the slave as on the master.
Here's my formula to remove a value from a string field of comma separated values. You can use a different delimiter just repalce the comma in the formula but pay attention since lot of commas here are the argument separator of the used functions.
The nice part is that with this formula you don't need to distinguish the first and last element of the string and this formula removes just the value_to_remove so if your initial string is: "4,11,34" and the value you want to remove is "4" you'll get "11,34".
UPDATE temp SET string = TRIM(BOTH ',' FROM REPLACE(CONCAT("," , string, ","), CONCAT(",",'value_to_remove', ",") , ',')) WHERE id=1
There is a simple way to convert the following Oracle usage of decode into MySql:
Oracle version:
select BU, count(line_number) total,
sum(decode(RECERTIFY_FLAG,'Y',1,0)) needed,
sum(decode(RECERTIFY_FLAG,'N',1,0)) not_needed,
sum(decode(RECERTIFY_FLAG,'Y',0,'N',0,1)) not_processed
from isf.isf_analog_line group by bu order by bu
MySql version that gives same results:
select BU, count(line_number) total,
sum(FIND_IN_SET(RECERTIFY_FLAG,'Y')) needed,
sum(FIND_IN_SET(RECERTIFY_FLAG,'N')) not_needed,
sum(FIND_IN_SET(RECERTIFY_FLAG,' ')) not_processed
from isf.isf_analog_line group by bu order by bu
Add your own comment.
I was looking for word_count("string") in mysql, finally came up with an user defined function which is very usefull for me, note: I used <space> for actual space.
1 row in set (0.00 sec)DROP FUNCTION IF EXISTS word_count;
CREATE FUNCTION word_count (f_string text(5000)) RETURNS smallint(10)
BEGIN
DECLARE new_string text(5000);
WHILE INSTR(f_string,'<space><space>')>0
DO
SET new_string=(select REPLACE(f_string,'<space><space>','<space>'));
SET f_string=new_string;
END WHILE;
RETURN (select LENGTH(TRIM(f_string))-LENGTH(REPLACE(TRIM(f_string),'<space>',''))+1);
END
//
Here is the result
mysql> select word_count("<space>Balaji<space><space><space> Devarajan<space>") WORD_COUNT;
mysql> select word_count(" Balaji Devarajan ") WORD_COUNT;
1 row in set (0.00 sec)
mysql> select word_count("Balaji Devarajan") WORD_COUNT;
1 row in set (0.01 sec)
Here is another function I wrote, which is very usefull in getting the domain name from the url, please bare with the spaces in the http : //, if not iam not able to submit this with many urls
1 row in set (0.00 sec)delimiter //
drop function if exists sub_domain;
CREATE FUNCTION sub_domain (url text(1000)) RETURNS CHAR(50)
BEGIN
DECLARE str1 varchar(10);
DECLARE str2 varchar(10);
SET str1=http : //;
SET str2=https://;
RETURN if(substring_index(substring_index(substring_index(REPLACE(url,str2,str1), '/',3),str1,-1),'.',-1) != 'com',substring_index(substring_index(substring_index(REPLACE(url,str2,str1), '/',3),str1,-1),'.',-3),substring_index(substring_index(substring_index(REPLACE(url,str2,str1), '/',3),str1,-1),'.',-2));
END;
//
mysql> select domain("http : //maps.google.com/maps?hl=en&q=what%20is%20domain%20name&btnG=Google+Search&ie=UTF-8&oe=UTF-8&um=1&sa=N&tab=il") DOMAIN;
similarly we can get the sub-domain too.
delimiter //
drop function if exists sub_domain;
CREATE FUNCTION sub_domain (url text(1000)) RETURNS CHAR(50)
BEGIN
DECLARE str1 varchar(10);
DECLARE str2 varchar(10);
SET str1=http : //;
SET str2=https : //;
RETURN if(substring_index(substring_index(substring_index(REPLACE(url,str2,str1), '/',3),str1,-1),'.',-1) != 'com',substring_index(substring_index(substring_index(REPLACE(url,str2,str1), '/',3),str1,-1),'.',-3),substring_index(substring_index(substring_index(REPLACE(url,str2,str1), '/',3),str1,-1),'.',-3));
END;
//
mysql> select sub_domain("http : //maps.google.com/maps?hl=en&q=what%20is%20domain%20name&btnG=Google+Search&ie=UTF-8&oe=UTF-8&um=1&sa=N&tab=il") sub_domain;
1 row in set (0.00 sec)
I was looking for function, to give me the MOST COMMON VALUE IN A STRING. Used Java or php to do this, using substrCount and SplitString (thanks to Chris Stubben). I got the below function, which will give me the mcv value.
1 row in set (0.02 sec)DELIMITER //
DROP FUNCTION IF EXISTS get_mcv;
CREATE FUNCTION get_mcv (list text(10000)) RETURNS text(1000)
BEGIN
DECLARE cnt int(10);
DECLARE iter_cnt int(10);
DECLARE item text(100);
DECLARE f_item text(100);
DECLARE prv_cnt int(10) default 0;
DECLARE nxt_cnt int(10) default 0;
IF list=' ' THEN
RETURN list;
END IF;
select substrCount(list,',')+1 into cnt;
SET iter_cnt = 1;
while cnt >= iter_cnt
do
select charsplit(list,',',iter_cnt) into item;
select substrCount(concat(',',list,','),concat(',',item,',')) into nxt_cnt;
IF nxt_cnt > prv_cnt THEN
SET prv_cnt = nxt_cnt;
SET f_item = item;
END IF;
set iter_cnt=iter_cnt+1;
end while;
RETURN f_item;
END
//
mysql> select get_mcv("dsfds,dsfds,fdfds,dfdsf,sd,df,df,df,df");
A field may contain delimited values that may be used with 'IN' operator in a where clause. However each of the values need to be nested within single quote(').
This function uses stringSplit and substrCount - thanks to Chris Stubben.
CREATE FUNCTION cs2in(x varchar(255), delim varchar(12)) returns varchar(255) deterministic
BEGIN
DECLARE retstr varchar(255);
DECLARE Valcount INT(10);
DECLARE v1 INT(10);
SET retstr = '';
SET Valcount = substrCount(x,delim)+1;
SET v1=0;
WHILE (v1 < Valcount) DO
SET retstr = concat_ws(',',retstr,quote(stringSplit(x,delim,v1+1)));
SET v1 = v1 + 1;
END WHILE;
SET retstr = CONCAT('(',TRIM(LEADING ',' FROM TRIM(retstr)),')');
RETURN retstr;
END
E.g.
mysql> Select cs2in('1,2,3,4,5',',') as IN_format;
This format is compatible for use in the 'IN' clause.
for a DIFFERENCE like function when comparing two strings together, try this:
------------------------------
DELIMITER $$
DROP FUNCTION IF EXISTS `db2`.`diff3`$$
CREATE DEFINER=`root`@`localhost` FUNCTION `diff3`(n1 varchar(50), n2 varchar(50)) RETURNS int(11)
DETERMINISTIC
BEGIN
declare s1 char(1);
declare s2 char(1);
declare s3 int;
declare s4 int;
declare l1 int;
declare diff int;
set l1 = 1;
set diff = 0;
repeat
set s1 = substring(soundex(n1),l1,1);
set s2 = substring(soundex(n2),l1,1);
if s1 = s2 then set diff = diff + 1;
end if;
set l1 = l1 + 1;
until l1 > 4
end repeat;
return diff;
END$$
DELIMITER ;
----------------------
other DBMS have this function and i kinda needed one. so looked and mysql's online docs shows a DIFFERENCE function but that was for GIS apps and isnt current implemented.
just change the "user@hostname" and the "db.function_name" to reflect your info.
returns an INT value from 0 to 4, where 0 means the SOUNDEX of each string doesnt have any same value. 4 means each 4 alphanumeric digit is the same:
select soundex("hello"), soundex("jello")
returns
H400 and J400
so DIFF3("hello", "jello")
returns a 3
while DIFF3("hello","great")
returns a 1
as a stored function, you can do something like:
select firstname from mytable where diff3("joe bob", firstname) > 3
Just some syntax to pass along for those who may wanna reformat credit cards (if you are brave enough to store them as plain text in the first place) before sending them to an application for displaying. This lets you pad out all the numbers except the last four (all you developers have done this millions of times I'm sure.)
SELECT LPAD(SUBSTRING(`cardnumbercolumn`, -4, 4),LENGTH(`cardnumbercolumn`),'*') FROM table
To format a number with leading zeros you can use LPAD:
eg: LPAD(13, 4, '0') returns '0013'
Any numbers that generate more than the number of digits (4 in this case) would be truncated from the left:
eg: LPAD(41278, 4, '0') returns '1278'
Simple Split
SELECT SUBSTRING_INDEX( SUBSTRING_INDEX( 'a|b|c|d|e|f|g|h', '|', index), '|', -1 );
A simple way to convert IP address from string '0A0B0C0D' into 10.11.12.13 :
CONCAT(
ASCII(UNHEX(SUBSTRING(yourIP,1,2))), '.',
ASCII(UNHEX(SUBSTRING(yourIP,3,2))), '.',
ASCII(UNHEX(SUBSTRING(yourIP,5,2))), '.',
ASCII(UNHEX(SUBSTRING(yourIP,7,2)))
)
I sometimes need to find the last occurrence of a string. Using a previous post I did this by implementing the following procedure
delimiter ||
DROP FUNCTION IF EXISTS locatelast||
CREATE FUNCTION locatelast(s VARCHAR(1000), ss VARCHAR(1000)) RETURNS TINYINT(3) UNSIGNED LANGUAGE SQL NOT DETERMINISTIC READS SQL DATA
BEGIN
DECLARE last TINYINT(3) UNSIGNED;
DECLARE offset TINYINT(3) UNSIGNED;
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET s = NULL;
SET last = 0;
SET offset = 1;
REPEAT
IF NOT ISNULL(s) AND offset > 0 THEN
SET offset = LOCATE(ss, s, offset);
IF offset > 0 THEN
SET last = offset;
SET offset = offset + 1;
END IF;
END IF;
UNTIL ISNULL(s) OR offset = 0 END REPEAT;
RETURN last;
END;
In response to Steven Benjamin on November 12 2007 7:53pm
Instead of looping through the string to look for the last occurrence, simply reverse() the string and look for the first occurrence, then substract the found position from the string length:
select @loc:=length(realname) - locate(" ", reverse(realname)) from table
For instance, looking for the last occurrence of a space?
select @string:="A horse with no name" as String, @loc:=length(@string) - locate(" ", reverse(@string))+1 AS lastoccurrence, left(@string, @loc), substr(@string,@loc+1);
It is important to check the value of group_concat_max_len
where using GROUP_CONCAT.
The default is 8192 (bytes), and if the result is bigger, it will be silently cropped, leading to unexpected results.
Some examples here: http://confronte.com.ar/groupconcat
If you want to compare an empty string to a numeric value or an integer field, you'll have to CAST the integer field or value to a string, due to the fact that for mysql, a zero (the integer one) equals to an empty string
Example :
SELECT 0 = '';
==> 1
SELECT '0' = '';
==> 0
SELECT CAST(0 AS CHAR) = '';
==> 0
This is common when you want to check user input : if a user inputs a "0" for a field, the check without cast will fail because mysql thinks this is an empty string.
Just checked a lot of documentation and also didn't find any solution to split delimited string into an array (temporary table).
I know the example where it is needed to walk through the input string to find dilimiters but I think this way annoying.
You may use my solution from below which is also keeping the order of substring occurance:
(input string is stored in "input" variable)
CREATE TEMPORARY TABLE temp (id TINYINT NOT NULL auto_increment, val CHAR(20), PRIMARY KEY(id));
SET input=REPLACE(input, ",", "'),('");
SET @dyn_sql=CONCAT("INSERT INTO temp (val) VALUES ('",input,"');");
PREPARE s1 FROM @dyn_sql; EXECUTE s1;
SELECT * FROM temp;
Hope this would help someone :)
In response to Denis Roy on November 15 2007 8:46pm
That is a great example. Here is how I used a very similar example to find a contact's last name from a contacts database by sub string on the last instance of the space:
SELECT @string:=Full_Name, SUBSTRING(@string, LENGTH(SUBSTRING(REVERSE(@string), LOCATE(" ", REVERSE(@string))-1))) AS Last_Name FROM contacts;
Something that would more elegant would be to have the LOCATE function include a direction option, like:
SELECT SUBSTRING(foo, LOCATE(" ", foo, RIGHT)) AS bar;
if you want to remove the tab(\t)
try this
update temp set locus=trim(BOTH ' ' from locus)
dudes & dudettes,
if you want to find the last occurrence of a particular string, use the tools mysql provides for you:
select reverse( substring( reverse( field ), locate( 'xyz', reverse( field ) )+1 ) )
---
this is way easier to implement and debug
As i was in charge to support soundex compatible string comparison i had made following first mysql function.
Maybe it will help someone to find it's own and better solutions ;)
CREATE FUNCTION SOUNDEX_SEARCHTEXT( haystack LONGTEXT, needle VARCHAR(40) ) RETURNS INT
BEGIN
DECLARE part VARCHAR(40) DEFAULT SUBSTRING( haystack, 1, LENGTH(needle) );
DECLARE iterator INT DEFAULT 1;
WHILE iterator < LENGTH( haystack )-LENGTH(needle)+1 DO
SET iterator = iterator + 1;
SET part = SUBSTRING( haystack, iterator, LENGTH(needle) );
IF part SOUNDS LIKE needle THEN
RETURN 1;
END IF;
END WHILE;
RETURN 0;
END
Hope it will be usefull - with best regards
I was going thru the example given in earlier post about the INITCAP/UCFIRST fucnctionality. It was very useful for the strings without space.. below procedure would useful for those string having space in it.
DELIMITER $$
DROP FUNCTION IF EXISTS `initcap`$$
CREATE DEFINER=`root`@`%` FUNCTION `initcap`(x varchar(255)) RETURNS varchar(255) CHARSET utf8
begin
set @l_str='';
set @r_str='';
if x REGEXP ' ' then
SELECT SUBSTRING_INDEX(x, ' ',1) into @l_str;
SELECT SUBSTRING_INDEX(x, ' ',-1) into @r_str;
return concat(concat( upper(substring(@l_str,1,1)),lower(substring(@l_str,2))),' ',concat( upper(substring(@r_str,1,1)),lower(substring(@r_str,2))));
else
return concat( upper(substring(x,1,1)),lower(substring(x,2)));
end if;
end$$
DELIMITER ;
In response to robyn wyrick on February 29 2008 7:01pm and Denis Roy on November 15 2007 8:46pm
Thank you for the hint!
Follow your examples I generate the following statement:
RIGHT(foo, locate(' ', REVERSE(foo))-1) as foo
Hello,
3 rows in set (0.00 sec)There are several stored procedures (e.g., see post of Grigory Dmitrenko) to transform
string like 'a,b,c' into something to be used like
....WHERE x IN ('a','b','c').
Here is solution to transform delimited string
into real table. Thus, it can be used further in
JOIN or IN constructs.
Pre-condition -- existence of some table with row count
bigger than elements in string. I will use
3-row table created on-fly by UNION.
mysql> select substring_index(substring_index('a,b,c', ',',@r:=@r+1),',',-1) zxz
from (select @r:=0) x,
(select 'x' xx union select 'v' xx union select 'z' xx) z;
Thanks ben j for double *substring_index* example posted above.
Chop off last 3 characters from a string. Cannot find an official way of doing this! This method works:
SELECT SUBSTRING(foobar,1,LENGTH(foobar)-3) AS foobar
I find this handy to extract the domain name (including subdomain) from an URL:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(url,'://',-1),'/',1)
FROM urls
It works for URLs with and without http(s). But doesn't work if you have local URLs without a leading slash like "folder/index.html". In that case it extracts "folder" instead of an empty string.
I recently found that after filling a table with info from a csv file created with MS Excel, an unwanted character went into the end of a field, and it showed up as "^M". So, when I issued a
mysql> select description from catalog;
the list looked all garbled
I used next command to remove it (most, but not all of the rows in the table were contaminated):
mysql> update catalog set description = left(description,length(description) -1) where description like "%^M%";
Please note that to replicate that "^M" you must press <ctrl> and v --though no character will be displayed-- and then <ctrl> and m.
You can use REVERSE to parse the last token from a string. This can be useful for name processing, for instance (first name is everything except last token and last name is last token):
SELECT
REVERSE(SUBSTR(REVERSE(name),INSTR(REVERSE(name),' ')+1)) AS first,
REVERSE(SUBSTR(REVERSE(name),1,INSTR(REVERSE(name),' '))) AS last
FROM table
Thanks Ilde for sharing your <control><m> example and especially how to type the <control><m> into the console.
One minor change, since you are only replacing the one character at the end is, change the where clause to
where description like "%^M";
Was looking around trying to find out how to use the initcap functionality that Oracle has but mysql does not and came across the two suggestions posted above. The second example, however, is still not 100% accurate, as if there are more than 2 words in the data, it will capitalize the first and last words, and completely delete anything in between! I wanted to write a full version of this method:
-- ***********************************
DELIMITER $$
DROP FUNCTION IF EXISTS `initcap` $$
CREATE FUNCTION `initcap`(x char(30)) RETURNS char(30) CHARSET utf8
BEGIN
SET @str='';
SET @l_str='';
WHILE x REGEXP ' ' DO
SELECT SUBSTRING_INDEX(x, ' ', 1) INTO @l_str;
SELECT SUBSTRING(x, LOCATE(' ', x)+1) INTO x;
SELECT CONCAT(@str, ' ', CONCAT(UPPER(SUBSTRING(@l_str,1,1)),LOWER(SUBSTRING(@l_str,2)))) INTO @str;
END WHILE;
RETURN LTRIM(CONCAT(@str, ' ', CONCAT(UPPER(SUBSTRING(x,1,1)),LOWER(SUBSTRING(x,2)))));
END $$
DELIMITER ;
-- ***********************************
One gotcha to note: this method strips out any leading and trailing spaces from the input, which really isn't that big of a deal, but something to keep in mind.
The ^M character is the DOS EOL character - and you can avoid the entire problem by dumping the file from Excel to a CSV file - if you're running in UNIX/Linux you can use the "dos2unix" utility which will strip out the ^M's and leave you with a portable file. Just my two cents.
I modified the function strSplit(from Chris Stubben) for utf8.
DROP FUNCTION IF EXISTS strSplit;
CREATE FUNCTION strSplit(x varchar(21845), delim varchar(255), pos int) returns varchar(255)
return replace(
replace(
substring_index(x, delim, pos),
substring_index(x, delim, pos - 1),
''
),
delim,
''
);
mysql> select strSplit("あ|ええ|いいい|おおお", '|', 4) as 4th;
re: to get the element at position x in a string..
instead of using the command:
select replace(
substring_index(field, delim, pos),
substring_index(field, delim, pos - 1),
'')
from table;
you can shorten it to require less modifiying by doing:
select substring_index(
substring_index(field, 'xyz', pos)
, 'xyz', -1)
from table;
that will get the last element of the list of x that were found. which should be the one you want.
hope this helps!
Here is another example of how to transform a comma separated string into a list of values that can be used in a WHERE x IN ('a','b','c') clause:
SELECT * from foo
WHERE FIND_IN_SET(col, REPLACE('a, b, c', ' ', '')) != 0;
count of '@' char:
SELECT
stringfield,
LENGTH(stringfield)-LENGTH(REPLACE(stringfield,'@',''))
FROM tablename
Here is a function I created to simulate PHP's ucwords, I know it can be improved so you're welcome to do so:
CREATE FUNCTION `ucwords`(`string` text) RETURNS text CHARSET latin1
BEGIN
DECLARE final_string text default '';
DECLARE char_index int unsigned default 1;
DECLARE upperchar int unsigned default 1;
IF LENGTH(string)>0 THEN
IF LOCATE(' ',string) > 0 OR LOCATE('.',string) OR LOCATE('(',string) > 0 OR LOCATE('¿',string) THEN
REPEAT
IF upperchar = 1 THEN
SET final_string = CONCAT(final_string,UPPER(SUBSTRING(string,char_index,1)));
SET upperchar = 0;
ELSE
SET final_string = CONCAT(final_string,SUBSTRING(string,char_index,1));
END IF;
IF (SUBSTRING(string,char_index,1) = ' ') OR (SUBSTRING(string,char_index,1) = '.') OR (SUBSTRING(string,char_index,1) = '(') OR (SUBSTRING(string,char_index,1) = '¿') THEN
SET upperchar = 1;
END IF;
SET char_index = char_index + 1;
UNTIL char_index > LENGTH(string)
END REPEAT;
ELSE
SET final_string = CONCAT(UPPER(SUBSTRING(string,1,1)),SUBSTRING(string,2));
END IF;
ELSE
SET final_string = string;
END IF;
RETURN final_string;
END
I needed a way to parse UTF-8 strings into words, not finding any mechanism that would allow me to specify a list of characters to split on I hit upon using regexp and string manipulation to parse the string. The following is a function to find regex defined positions in a string and a procedure to break words out of a string based on regex.
delimiter $$
--
-- This function will return the first position in p_str where the regexp is true
--
drop function if exists regexPos $$
create function regexPos( p_str TEXT, p_regex varchar(250) ) returns int
BEGIN
declare v_pos int;
declare v_len int;
set v_pos=1;
set v_len=1+char_length( p_str );
while (( substr( p_str, 1, v_pos) REGEXP p_regex)=0 and (v_pos<v_len))
do
set v_pos = v_pos + 1;
end while;
return v_pos-1;
end $$
--
-- This procedure parses p_str into words based on the regular expression p_regex.
-- The simplest usage is call ParseWords( "some string", "[[:space:]]" );
-- this will break the string on spaces.
CREATE procedure ParseWords (IN p_str TEXT, IN p_regex varchar(256))
begin
declare v_startPos int;
declare v_strLen int;
declare v_wordLen int;
set v_startPos=1;
set v_strLen=char_length( p_str )+1;
while ( v_startPos < v_strLen )
do
set v_wordLen = regexPos( substring( p_str, v_startPos ), p_regex );
while (v_wordLen = 0) and ( v_startPos < v_strLen)
do
-- if the wordLen is 0 (zero) then we have a space at the start
-- so remove it and try again.
set v_startPos = v_startPos+1;
set v_wordLen = regexPos( substring( p_str, v_startPos ), p_regex );
end while;
if (v_wordLen > 0)
then
-- we found a word.
-- do something useful here. This example simply prints out the words
-- a real application will probably insert them into a table.
select substring( p_str, v_startPos, v_wordLen );
set v_startPos = v_startPos + v_wordLen +1;
end if;
end while;
end $$
delimiter ;
I was in need of a way to find the start position of a regex match in a string. I tried using regexPos above, but it seemed to be returning the end position of the match (minus 1). So, I modified it and called it rendpos, and used it to find the start position.
DELIMITER $$
DROP FUNCTION IF EXISTS `rendpos` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `rendpos`(p_regex varchar(250),p_str TEXT) RETURNS int(11)
BEGIN
declare v_endpos int;
declare v_startpos int;
declare v_len int;
set v_endpos=1;
set v_len=1+char_length( p_str );
while (( substr( p_str, 1, v_endpos) REGEXP p_regex)=0 and (v_endpos<v_len))
do
set v_endpos = v_endpos + 1;
end while;
return v_endpos;
END $$
DELIMITER ;
Here is a quick and dirty find of start position. It will find the minimal match instead of the maximal pattern match. Please feel free to modify this to find the maximal pattern match.
DELIMITER $$
DROP FUNCTION IF EXISTS `rlocate` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `rlocate`(p_regex varchar(250),p_str TEXT, s_startpos int) RETURNS int(11)
BEGIN
declare v_endpos int;
declare v_startpos int;
declare v_len int;
set v_endpos=rendpos(p_regex, p_str, s_startpos);
set v_startpos = v_endpos;
while (( substr( p_str, v_startpos, v_endpos) REGEXP p_regex)=0 and (v_startpos > 0))
do
set v_startpos = v_startpos - 1;
end while;
return v_startpos;
END $$
DELIMITER ;
The extract uses the above two functions, so it will likewise extract the minimal pattern.
DELIMITER $$
DROP FUNCTION IF EXISTS `rextract` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `rextract`(p_regex varchar(250),p_str TEXT, s_startpos int) RETURNS varchar(250) CHARSET latin1
begin
declare startpos int;
declare endpos int;
set startpos = rlocate(p_regex,p_str,s_startpos);
set endpos = rendpos(p_regex,p_str,s_startpos);
return mid(p_str,startpos,endpos-startpos+1);
end $$
DELIMITER ;
Add your own comment.