On Wednesday, 2 August 2017 13:52:06 CEST Richard W.M. Jones wrote:
On Wed, Aug 02, 2017 at 12:33:14PM +0200, Pino Toscano wrote:
> Hi,
>
> (replying here since v2 of the series does not have this explanation.)
>
> On Tuesday, 1 August 2017 16:00:15 CEST Richard W.M. Jones wrote:
> > We'd like to use PCRE instead of the awful Str module. However I
> > don't necessarily want to pull in the extra dependency of ocaml-pcre,
> > and in any case ocaml-pcre is rather difficult to use.
> >
> > This introduces very simplified and lightweight bindings for PCRE.
> >
> > They work rather like Str in that there is some global state (actually
> > thread-local in this implementation) between the matching and the
> > getting the substring, so you can write code like this:
> >
> > let re = PCRE.compile "(a+)b"
> > ...
> >
> > if PCRE.matches re "ccaaaabb" then (
> > let whole = PCRE.sub 0 in (* returns "aaaab" *)
> > let first = PCRE.sub 1 in (* returns "aaaa" *)
> > ...
>
> Since we are providing a better module, with a different API (which
> needs changes), what about removing the usage of a global state, in
> favour of a match object holding the captures? Something like
> (starting from your example above):
>
> let re = PCRE.compile "(a+)b" in
> try
> let m = PCRE.match re "ccaaaabb" in
> let whole = PCRE.sub m 0 in (* returns "aaaab" *)
> let first = PCRE.sub m 1 in (* returns "aaaa" *)
> with Not_matched _ ->
> ...
That's what I was trying to avoid. I think the if statement with
global state is much easier to use.
> This makes it possible to stop thinking about what was the last saved
> state, and even keep the multiple results of matches at the same time.
I've converted all of the daemon code to this form, and this is
not an issue that came up.
Right, because we have already these constraints because of Str.
> Also the results are properly GC'ed once they get out of scope, and not
> linger until the thread finish (or the program shutdown).
> The drawback I see is that many of the Str usages are in chains of
> "if ... else if ...", which could make the code slightly more complex.
>
> Of course PCRE.matches ought to be left, but it would just return
> whether the re matched, without changing any global state, and without
> any result available.
I think you're suggesting this:
let m = PCRE.exec re "ccaaaabb" in
if PCRE.matches m then (
let whole = PCRE.sub m 0 in
Not really, my suggestion was to have a separate object representing
the result of a regex match -- much like other language have in their
regex APIs.
OTOH, this solution LGTM as well: the result of the regex is not saved
in a thread-local variable, but directly in the same regex object, so
can be kept/used around, and it is GC'ed when not needed anymore.
If you could apply that change, that'd be a LGTM.
Thanks,
--
Pino Toscano