split a single string at some given positions or patterns
    chunks = strsplit(string)
    chunks = strsplit(string, indices)
    [chunks, matched_separators] = strsplit(string, separators)
    [chunks, matched_separators] = strsplit(string, separators, limit)
    [chunks, matched_separators] = strsplit(string, regexp)
    [chunks, matched_separators] = strsplit(string, regexp, limit)
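string: the single string to split.
indices: vector of character positions in [1, length(string)-1] where string is split.
separators: vector of strings searched for in string and used as scissors. UTF8 extended characters are supported.
regexp: regular expression, such as "/k.{2}o/", searched for in string and used as scissors.
chunks: column of strings holding the split chunks. When indices is used, it has length(indices)+1 elements.
matched_separators: column of size(chunks,1)-1 elements: matched separators or expression patterns.
limit: maximal number of searches for a matching separator or expression in string. If string includes more separator occurrences, its unsplit tail is returned as the last chunk, in chunks($).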
strsplit(string) splits string into all its individual characters.
strsplit(string, indices) splits string at the character positions given in the indices vector. The character at each of these indices is the last character of a returned chunk: the cut is made right after it.
strsplit(string, separators) splits string after each occurrence of any of the separators strings. The separators detected and used are removed from the tails of the chunks. strsplit(string, "") is equivalent to strsplit(string).
strsplit(string, regexp) does the same, except that string is parsed for the given regular expression, used as a "generic separator", instead of for any "constant" separator from a limited separators set.
If string starts with a matching separator or expression, chunks(1) is set to "". If string ends with a matching separator or expression, "" is appended as the last element of chunks.
If no matching separator or regexp is found in string, string is returned as is in chunks. This is notably the case for string="".
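The boundary rules above can be checked in a short session. This is a minimal sketch that assumes the behaviour described on this page; the sample strings and the variable names c1, c2 and c3 are illustrative only.

```scilab
// Separator at both ends: an empty chunk is expected at the head and at the tail
c1 = strsplit(",a,b,", ",");
disp(c1');

// No separator found: the whole string should come back as a single chunk
c2 = strsplit("abc", ";");
disp(c2);

// Empty input: same rule, a single "" chunk is expected
c3 = strsplit("", ";");
disp(size(c3));
```

Keeping the empty chunks rather than dropping them is what makes the number of chunks predictable from the number of separators.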
Without the limit option, any string including n separators is split into n+1 chunks.
strsplit(string, separators, limit) or strsplit(string, regexp, limit) searches for a matching separator or expression at most limit times. If matches remain in the unprocessed tail of string, this tail is returned as is in chunks($).
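As an illustration of the limit option, here is a small sketch on a hypothetical sample string holding four separators, with only two searches allowed. It assumes the behaviour described above.

```scilab
// Hypothetical sample: 4 commas in the input, but limit = 2
[c, s] = strsplit("a,b,c,d,e", ",", 2);

disp(c');   // two split chunks, then the unprocessed tail "c,d,e"
disp(s');   // only the separators actually consumed

// There is always one more chunk than matched separators
disp(size(c, 1) == size(s, 1) + 1);
```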
[chunks, matched_separators] = strsplit(string,…) returns the column of the matched separators or expressions, in addition to chunks. Then strcat([chunks' ; [matched_separators' ""]]) should be equal to string.
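The round trip is most useful with a regular expression, since the matched text then differs from one cut to the next. The following sketch uses a hypothetical sample string and the illustrative names txt and rebuilt.

```scilab
txt = "ab12cd345ef";
[c, s] = strsplit(txt, "/\d+/");

// Build a 2-row matrix: chunks on top, matches below, padded with a final "".
// strcat() then concatenates it column by column: c(1) s(1) c(2) s(2) ... c($) ""
rebuilt = strcat([c' ; [s' ""]]);

disp(rebuilt == txt);
```

The trailing "" is only padding: there is one separator fewer than chunks, so the second row needs one extra element to match the first.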
[Table: comparison between strsplit() and tokens()]
Split at given indices:
    --> strsplit("Scilab")'
     ans  =
      "S"  "c"  "i"  "l"  "a"  "b"

    --> strsplit("αβδεϵζηθικλμνξοπρστυφϕχψω", [1 6 11])
     ans  =
      "α"
      "βδεϵζ"
      "ηθικλ"
      "μνξοπρστυφϕχψω"
Split at matching separators:
    strsplit("aabcabbcbaaacacaabbcbccaaabcbc", "aa")  // t starts with the separator => heading "" chunk

    // Consecutive separators are not squeezed:
    strsplit("abbcccdde", "c")'

    // With several possible separators:
    t = "aabcabbcbaaacacaabbcbccaaabcbc";
    [c, s] = strsplit(t, ["aa" "bb"]);
    c', s'
    strcat([c';[s' ""]]) == t

    // Let's limit the number of split to 4, => 4 chunks + unprocessed tail:
    strsplit("aabcabbcbaaacacaabbcbccaaabcbc", ["aa" "bb"], 4)'

    // Splitting a string ending with a separator yields a final "":
    strsplit("aabcabbcbaaacacaabbcbccaaabcbc", "cbc")'
    --> strsplit("aabcabbcbaaacacaabbcbccaaabcbc", "aa")  // t starts with the separator => heading "" chunk
     ans  =
      ""
      "bcabbcb"
      "acac"
      "bbcbcc"
      "abcbc"

    --> // Consecutive separators are not squeezed:
    --> strsplit("abbcccdde", "c")'
     ans  =
      "abb"  ""  ""  "dde"

    --> // With several possible separators:
    --> t = "aabcabbcbaaacacaabbcbccaaabcbc";
    --> [c, s] = strsplit(t, ["aa" "bb"]);
    --> c', s'
     ans  =
      ""  "bca"  "cb"  "acac"  ""  "cbcc"  "abcbc"
     ans  =
      "aa"  "bb"  "aa"  "aa"  "bb"  "aa"

    --> strcat([c';[s' ""]]) == t
     ans  =
      T

    --> // Let's limit the number of split to 4, => 4 chunks + unprocessed tail:
    --> strsplit("aabcabbcbaaacacaabbcbccaaabcbc", ["aa" "bb"], 4)'
     ans  =
      ""  "bca"  "cb"  "acac"  "bbcbccaaabcbc"

    --> // Splitting a string ending with a separator yields a final "":
    --> strsplit("aabcabbcbaaacacaabbcbccaaabcbc", "cbc")'
     ans  =
      "aabcabbcbaaacacaabb"  "caaab"  ""
Use a regular expression as scissors:
    [c, s] = strsplit("C:\Windows\System32\OpenSSH\", "/\\|:/");
    c', s'

    [c, s] = strsplit("abcdef8ghijkl3mnopqr6stuvw7xyz", "/\d+/", 2);
    c', s'
    --> [c, s] = strsplit("C:\Windows\System32\OpenSSH\", "/\\|:/");
    --> c', s'
     ans  =
      "C"  ""  "Windows"  "System32"  "OpenSSH"  ""
     ans  =
      ":"  "\"  "\"  "\"  "\"

    --> [c, s] = strsplit("abcdef8ghijkl3mnopqr6stuvw7xyz", "/\d+/", 2);
    --> c', s'
     ans  =
      "abcdef"  "ghijkl"  "mnopqr6stuvw7xyz"
     ans  =
      "8"  "3"