module Str = struct ... end
Types
regexp
The type of compiled regular expressions. Abstract.
split_result
The type of the elements returned by full_split: Text for a substring, Delim for a delimiter.
Functions
regexp : string -> regexp
Compiles a regular expression. The special characters are $^.*+?[]. The following constructs are recognized:
  .        matches any character except newline
  *        (postfix) matches the previous expression zero, one or several times
  +        (postfix) matches the previous expression one or several times
  ?        (postfix) matches the previous expression once or not at all
  [..]     character set; ranges are denoted with -, as in [a-z]; an initial ^, as in [^0-9], complements the set
  ^        matches at beginning of line
  $        matches at end of line
  \|       (infix) alternative between two expressions
  \(..\)   grouping and naming of the enclosed expression
  \1       the text matched by the first \(...\) expression (\2 for the second expression, etc)
  \b       matches word boundaries
  \        quotes special characters
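The constructs listed above can be tried out with a small sketch. It assumes the str library is linked and that arguments are passed positionally, as the unlabelled Str module of current OCaml accepts; pair_re and parse_pair are hypothetical names used only for illustration.

```ocaml
(* Build with the str library, e.g.
   ocamlfind ocamlopt -package str -linkpkg demo.ml *)

(* Digits, a dash, then lowercase letters. Backslashes are doubled
   because the pattern sits inside an OCaml string literal. *)
let pair_re = Str.regexp "\\([0-9]+\\)-\\([a-z]+\\)"

(* Hypothetical helper: returns both groups if [s] matches at position 0. *)
let parse_pair s =
  if Str.string_match pair_re s 0
  then Some (Str.matched_group 1 s, Str.matched_group 2 s)
  else None
```

Here parse_pair "42-abc" yields Some ("42", "abc"), while parse_pair "abc-42" yields None because the match must start at position 0.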
regexp_case_fold : string -> regexp
Same as regexp, but the compiled expression will match text in a case-insensitive way: uppercase and lowercase letters are considered equivalent.
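A minimal sketch of the difference between the two compilers, assuming the str library is linked; matches is a hypothetical helper, not part of Str.

```ocaml
let hello_cs = Str.regexp "hello"            (* case-sensitive *)
let hello_ci = Str.regexp_case_fold "hello"  (* case-insensitive *)

(* Hypothetical helper: does [r] match [s] starting at position 0? *)
let matches r s = Str.string_match r s 0
```

With these definitions, matches hello_ci "HeLLo" is true and matches hello_cs "HeLLo" is false.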
quote : string -> string
Str.quote s returns a regexp string that matches exactly s and nothing else.
regexp_string : string -> regexp
regexp_string_case_fold : string -> regexp
Str.regexp_string s returns a regular expression that matches exactly s and nothing else. Str.regexp_string_case_fold is similar, but the regexp matches in a case-insensitive way.
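The following sketch shows why quoting matters when the text to match contains special characters. It assumes the str library is linked; literal and matches are hypothetical names.

```ocaml
let literal = "a.b"

let unquoted = Str.regexp literal              (* '.' matches any character *)
let quoted   = Str.regexp (Str.quote literal)  (* '.' is matched literally *)
let direct   = Str.regexp_string literal       (* same effect in one step *)

(* Hypothetical helper: does [r] match [s] starting at position 0? *)
let matches r s = Str.string_match r s 0
```

Only unquoted accepts "axb"; both quoted and direct accept "a.b" and reject "axb".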
string_match : pat:regexp -> string -> pos:int -> bool
string_match r s start tests whether the characters in s starting at position start match the regular expression r. The first character of a string has position 0, as usual.
search_forward : pat:regexp -> string -> pos:int -> int
search_forward r s start searches the string s for a substring matching the regular expression r. The search starts at position start and proceeds towards the end of the string. Returns the position of the first character of the matched substring, or raises Not_found if no substring matches.
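Because a failed search raises Not_found, callers often wrap it in an option. A minimal sketch, assuming the str library is linked; find_number is a hypothetical helper.

```ocaml
let digits = Str.regexp "[0-9]+"

(* Hypothetical helper: position of the first run of digits in [s], if any. *)
let find_number s =
  try Some (Str.search_forward digits s 0)
  with Not_found -> None
```

For example, find_number "abc 123 def" is Some 4 and find_number "no digits" is None.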
search_backward : pat:regexp -> string -> pos:int -> int
Same as search_forward, but the search proceeds towards the beginning of the string.
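A sketch of a backward search from the end of a string, assuming the str library is linked; last_number is a hypothetical helper. The result is the rightmost position at which a match can start, so a greedy pattern such as [0-9]+ may report only the tail of a longer run.

```ocaml
let digits = Str.regexp "[0-9]+"

(* Hypothetical helper: search backward, starting at the end of [s]. *)
let last_number s =
  try
    let pos = Str.search_backward digits s (String.length s) in
    Some (pos, Str.matched_string s)
  with Not_found -> None
```

On "a1b22" this sketch returns Some (4, "2"): position 4 is tried before position 3, and a match already succeeds there.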
string_partial_match : pat:regexp -> string -> pos:int -> bool
Similar to string_match, but succeeds whenever the argument string is a prefix of a string that matches. This includes the case of a true complete match.
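Partial matching is useful for checking input that is still being typed. A minimal sketch, assuming the str library is linked; hello, full and partial are hypothetical names.

```ocaml
let hello = Str.regexp "hello"

(* Hypothetical helpers comparing complete and partial matching at position 0. *)
let full s = Str.string_match hello s 0
let partial s = Str.string_partial_match hello s 0
```

Here partial "hel" is true although full "hel" is false, and partial "help" is false because no string starting with "help" can match.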
matched_string : string -> string
matched_string s returns the substring of s that was matched by the latest string_match, search_forward or search_backward. The user must make sure that the parameter s is the same string that was passed to the matching or searching function.
match_beginning : unit -> int
match_end : unit -> int
match_beginning() returns the position of the first character of the substring that was matched by string_match, search_forward or search_backward. match_end() returns the position of the character following the last character of the matched substring.
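The three accessors read the state left behind by the latest match, so they are called right after the search. A sketch assuming the str library is linked; locate is a hypothetical helper.

```ocaml
let word = Str.regexp "[a-z]+"

(* Hypothetical helper: start, end and text of the first lowercase word. *)
let locate s =
  try
    let (_ : int) = Str.search_forward word s 0 in
    Some (Str.match_beginning (), Str.match_end (), Str.matched_string s)
  with Not_found -> None
```

For "12 abc 34" the result is Some (3, 6, "abc"): the match starts at position 3 and position 6 is the first character after it.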
matched_group : int -> string -> string
matched_group n s returns the substring of s that was matched by the nth group \(...\) of the regular expression during the latest string_match, search_forward or search_backward. The user must make sure that the parameter s is the same string that was passed to the matching or searching function.
matched_group n s raises Not_found if the nth group of the regular expression was not matched. This can happen with groups inside alternatives \|, options ? or repetitions *. For instance, the empty string will match \(a\)*, but matched_group 1 "" will raise Not_found because the first group itself was not matched.
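The Not_found case described above can be handled as in this sketch, which assumes the str library is linked; star_a and group1 are hypothetical names.

```ocaml
let star_a = Str.regexp "\\(a\\)*"

(* Hypothetical helper: text of group 1 after matching at position 0,
   or None when the overall match succeeded without using the group. *)
let group1 s =
  if Str.string_match star_a s 0 then
    (try Some (Str.matched_group 1 s) with Not_found -> None)
  else None
```

The empty string matches the pattern, yet group1 "" is None; on "aa" the group holds the text of its last repetition, "a".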
group_beginning : int -> int
group_end : int -> int
group_beginning n returns the position of the first character of the substring that was matched by the nth group of the regular expression. group_end n returns the position of the character following the last character of the matched substring. Both functions raise Not_found if the nth group of the regular expression was not matched.
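Group positions are handy when only offsets are needed rather than the text itself. A sketch assuming the str library is linked; kv and value_span are hypothetical names.

```ocaml
(* key=value, with letters as the key and digits as the value *)
let kv = Str.regexp "\\([a-z]+\\)=\\([0-9]+\\)"

(* Hypothetical helper: start and end offsets of the value (group 2). *)
let value_span s =
  if Str.string_match kv s 0
  then Some (Str.group_beginning 2, Str.group_end 2)
  else None
```

For "x=10" this gives Some (2, 4), the half-open range covering "10".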
global_replace : pat:regexp -> templ:string -> string -> string
global_replace regexp templ s returns a string identical to s, except that all substrings of s that match regexp have been replaced by templ. The replacement template templ can contain \1, \2, etc; these sequences will be replaced by the text matched by the corresponding group in the regular expression. \0 stands for the text matched by the whole regular expression.
replace_first : pat:regexp -> templ:string -> string -> string
Same as global_replace, except that only the first substring matching the regular expression is replaced.
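Template-based replacement can be sketched as follows, assuming the str library is linked; the helper names are hypothetical. Backslashes in templates are doubled for the same reason as in patterns.

```ocaml
let number = Str.regexp "\\([0-9]+\\)"

(* Hypothetical helpers illustrating \1, \0 and first-only replacement. *)
let bracket_group s = Str.global_replace number "<\\1>" s
let bracket_whole s = Str.global_replace number "[\\0]" s
let mask_first s = Str.replace_first number "#" s
```

On "a1b22" these return "a<1>b<22>", "a[1]b[22]" and "a#b22" respectively.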
global_substitute : pat:regexp -> subst:(string -> string) -> string -> string
global_substitute regexp subst s returns a string identical to s, except that all substrings of s that match regexp have been replaced by the result of function subst. The function subst is called once for each matching substring, and receives s (the whole text) as argument.
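Since subst receives the whole text rather than the matched part, it typically calls matched_string to recover the current match. A sketch assuming the str library is linked and String.uppercase_ascii is available (OCaml 4.03 or later); upcase_words is a hypothetical helper.

```ocaml
(* Hypothetical helper: uppercase every run of lowercase letters. *)
let upcase_words s =
  Str.global_substitute (Str.regexp "[a-z]+")
    (fun whole -> String.uppercase_ascii (Str.matched_string whole))
    s
```

For instance, upcase_words "ab 12 cd" returns "AB 12 CD".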
substitute_first : pat:regexp -> subst:(string -> string) -> string -> string
Same as global_substitute, except that only the first substring matching the regular expression is replaced.
replace_matched : templ:string -> string -> string
replace_matched repl s returns the replacement text repl in which \1, \2, etc. have been replaced by the text matched by the corresponding groups in the most recent matching operation. s must be the same string that was matched during this matching operation.
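Unlike global_replace, replace_matched returns only the expanded template, not the surrounding text. A sketch assuming the str library is linked; two_words and swap are hypothetical names.

```ocaml
let two_words = Str.regexp "\\([a-z]+\\) \\([a-z]+\\)"

(* Hypothetical helper: swap two space-separated words matched at position 0. *)
let swap s =
  if Str.string_match two_words s 0
  then Some (Str.replace_matched "\\2 \\1" s)
  else None
```

For example, swap "hello world" is Some "world hello".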
split : sep:regexp -> string -> string list
split r s splits s into substrings, taking as delimiters the substrings that match r, and returns the list of substrings. For instance, split (regexp "[ \t]+") s splits s into blank-separated words. Occurrences of the delimiter at the beginning and at the end of the string are ignored.
bounded_split : sep:regexp -> string -> max:int -> string list
Same as split, but splits into at most n substrings, where n is the extra integer parameter.
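Both splitting functions can be illustrated with a short sketch, assuming the str library is linked; words and key_and_rest are hypothetical helpers.

```ocaml
(* Hypothetical helper: blank-separated words, as in the example above. *)
let words s = Str.split (Str.regexp "[ \t]+") s

(* Hypothetical helper: split on commas into at most two pieces. *)
let key_and_rest s = Str.bounded_split (Str.regexp ",") s 2
```

Here words "  hello   world  " is ["hello"; "world"], and key_and_rest "a,b,c,d" is ["a"; "b,c,d"] because splitting stops once two substrings have been produced.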
split_delim : sep:regexp -> string -> string list
bounded_split_delim : sep:regexp -> string -> max:int -> string list
Same as split and bounded_split, but occurrences of the delimiter at the beginning and at the end of the string are recognized and returned as empty strings in the result. For instance, split_delim (regexp " ") " abc " returns [""; "abc"; ""], while split with the same arguments returns ["abc"].
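The example above can be written out as runnable code. This sketch assumes the str library is linked; space, with_edges and without_edges are hypothetical names.

```ocaml
let space = Str.regexp " "

(* Leading and trailing delimiters produce empty strings ... *)
let with_edges = Str.split_delim space " abc "

(* ... whereas split drops them. *)
let without_edges = Str.split space " abc "
```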
full_split : sep:regexp -> string -> split_result list
bounded_full_split : sep:regexp -> string -> int -> split_result list
Same as split_delim and bounded_split_delim, but returns the delimiters as well as the substrings contained between delimiters. The former are tagged Delim in the result list; the latter are tagged Text. For instance, full_split (regexp "[{}]") "{ab}" returns [Delim "{"; Text "ab"; Delim "}"].
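The tagged result lends itself to pattern matching. A sketch assuming the str library is linked; braces, pieces and texts_only are hypothetical names.

```ocaml
let braces = Str.regexp "[{}]"

(* The full split from the example above. *)
let pieces = Str.full_split braces "{ab}"

(* Hypothetical helper: keep the Text parts and drop the delimiters. *)
let texts_only parts =
  List.filter_map
    (function Str.Text t -> Some t | Str.Delim _ -> None)
    parts
```

List.filter_map requires OCaml 4.08 or later; on older compilers a fold would do the same job.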
string_before : string -> int -> string
string_before s n returns the substring of all characters of s that precede position n (excluding the character at position n).
string_after : string -> int -> string
string_after s n returns the substring of all characters of s that follow position n (including the character at position n).
first_chars : string -> len:int -> string
first_chars s n returns the first n characters of s. This is the same function as string_before.
last_chars : string -> len:int -> string
last_chars s n returns the last n characters of s.
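The four substring helpers can be compared side by side. A sketch assuming the str library is linked; the bound names are hypothetical.

```ocaml
let s = "hello world"

let before = Str.string_before s 5  (* characters at positions 0..4 *)
let after  = Str.string_after s 6   (* characters from position 6 onwards *)
let first  = Str.first_chars s 5    (* same result as string_before s 5 *)
let last   = Str.last_chars s 5     (* the final five characters *)
```

With this input, before and first are both "hello", while after and last are both "world".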