Compiling SED-like command in DCG phrase.

Kuniaki Mukai

Hi,

I have added a sed-like command to DCG phrases, based on
the regular expression compiler that I posted
in my previous message.

The following examples quickly show what the sed command in DCG does.

The first example removes all occurrences of alphabetic letters.
% ?- phrase(sed(wl("[a-zA-Z]"), =([])), `1a2b3c4d56e`,  V), string_codes(S, V).
%@ S = "123456"

The second one swaps adjacent pairs of letters.
% ?- phrase(sed((w(".", A), w(".", B)),  append(B, A)), `abcdefg`,  V), string_codes(S, V).
%@ S = "badcfeg" .


More generally, a sed expression sed(<DCG Phrase>, <pred/1>) replaces each part
of the input codes matched by <DCG Phrase> with the value returned by calling <pred/1>.
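
As a rough illustration of this behaviour in plain SWI-Prolog, without PAC, the sketch below scans the input, replaces each match with the value computed from it, and copies unmatched codes unchanged. All names here (sed_sketch/4, letter_match//1, any_pair//1, delete_match/2, swap_pair/2) are hypothetical and are not part of PAC. Unlike sed/2 above, this sketch passes the matched codes to the function explicitly instead of sharing variables such as A and B.

```prolog
:- use_module(library(lists)).   % append/3

% sed_sketch(+Matcher, +Func, +In, -Out)
% Hypothetical helper, not part of PAC.
%   Matcher: a DCG nonterminal with one extra argument that returns
%            the matched codes, e.g. letter_match//1 below.
%   Func:    called as call(Func, Matched, Replacement).
sed_sketch(Matcher, Func, In, Out) :-
    (   In == []
    ->  Out = []
    ;   call(Matcher, Matched, In, Rest),   % relies on the usual DCG translation
        Matched = [_|_]                     % require progress on every match
    ->  call(Func, Matched, Replacement),
        append(Replacement, Out1, Out),
        sed_sketch(Matcher, Func, Rest, Out1)
    ;   In = [C|In1],                       % no match here: copy one code
        Out = [C|Out1],
        sed_sketch(Matcher, Func, In1, Out1)
    ).

% Matches exactly one ASCII letter.
letter_match([C]) --> [C], { ascii_letter(C) }.

ascii_letter(C) :-
    (   C >= 0'a, C =< 0'z
    ->  true
    ;   C >= 0'A, C =< 0'Z
    ).

% Matches any two consecutive codes.
any_pair([A, B]) --> [A], [B].

% Replacement functions: delete the match, or swap a matched pair.
delete_match(_, []).
swap_pair([A, B], [B, A]).

% ?- sed_sketch(letter_match, delete_match, `1a2b3c4d56e`, V), string_codes(S, V).
%@ S = "123456"
% ?- sed_sketch(any_pair, swap_pair, `abcdefg`, V), string_codes(S, V).
%@ S = "badcfeg"
```

The if-then-else commits to the first match found at each position, which plays the same role as the cut after the expanded phrase in the clause shown further down.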

The following clause for expand_sed/6 is almost all that I have added to
my "PAC" library to support sed.

expand_sed(Words,  Func_in_pac,  Mod,  G_callable, List, List4):-
        pac:expand_phrase(Words, Mod, Phrase, List, List0),
        pac:expand_arg(Func_in_pac, Mod, Func_callable, List0, List1),
        pac:phrase_to_pred(Phrase, Mod, [P, P0]:- Expanded_phrase, List1, List2),
        pac:expand_core(
                rec(Sed_name, [], ( [L0, L1, P, Q]  :-  Expanded_phrase,  !,
                                                call(Func_callable, Val),
                                                append(Val, L2, L0),
                                                call(Sed_name, L2, L1, P0, Q))
                        &    ( [[C|Z0], Z, [C|P], Q]:- call(Sed_name, Z0, Z, P, Q))
                        &    [R, R, [], []] ),
                          Mod, Sed_main, List2, List3),
        pac:expand_core(pred([U,V]:- call(Sed_main, V, [], U, [])),  Mod, G_callable, List3, List4).

In this clause, expand_phrase, expand_arg, phrase_to_pred, and expand_core are
basic tools in my private library PAC, and rec(F, V, P) is a PAC
expression for recursive anonymous predicates, which recurse in the same way
as a named recursive predicate such as append/3.
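
To make the shape of the three rec clauses easier to follow, here is a hand-written named analogue for the first example (removing letters). It is only a sketch of the recursion pattern, not the code that PAC generates; the names sed_loop/4, letter_prefix/2 and remove_letters/2 are hypothetical.

```prolog
:- use_module(library(lists)).   % append/3

% letter_prefix(+In, -Rest): In starts with one ASCII letter.
% Plays the role of Expanded_phrase for the pattern [a-zA-Z].
letter_prefix([C|P0], P0) :-
    (   C >= 0'a, C =< 0'z
    ->  true
    ;   C >= 0'A, C =< 0'Z
    ).

% sed_loop(Out, OutTail, In, InTail)
% Same argument order as [L0, L1, P, Q] in the rec expression.
sed_loop(L0, L1, P, Q) :-
    letter_prefix(P, P0), !,        % a match at the current position
    Val = [],                       % the replacement, as =([]) returns
    append(Val, L2, L0),            % emit the replacement
    sed_loop(L2, L1, P0, Q).        % continue after the match
sed_loop([C|Z0], Z, [C|P], Q) :-    % no match: copy one code
    sed_loop(Z0, Z, P, Q).
sed_loop(R, R, [], []).             % end of input

% Entry point, mirroring pred([U,V] :- call(Sed_main, V, [], U, [])).
remove_letters(U, V) :-
    sed_loop(V, [], U, []).

% ?- remove_letters(`1a2b3c4d56e`, V), string_codes(S, V).
%@ S = "123456"
```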

The following DCG rule removes all occurrences of TeX control sequences from TeX text given as codes.

sed_for_detex --> sed(wl(   "\\\\[a-zA-Z]+"
                      | "\\\\[\\!\\\"\\#\\$\\'\\(\\)\\=\\-\\~\\^\\\\\\|\\`\\@\\{\\[\\}\\]\\*\\:\\+\\;\\<\\>\\,\\.]"),
              pred([[]])).

Here w and wl select the "shortest match first" and "longest match first" search modes, respectively.


% ?- sed_for_detex(`\\Large Hello\\; World! `, X), string_codes(Y, X).
%@ Y = " Hello World! " .
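
One way to read the two search modes w and wl mentioned above is as the order in which a repetition offers its solutions on backtracking. The plain DCG sketch below shows that reading for a run of one or more letters. It is an assumption about the intended meaning, not PAC's implementation, and the names letters_longest//1, letters_shortest//1 and ascii_letter//1 are hypothetical.

```prolog
% One ASCII letter.
ascii_letter(C) -->
    [C],
    {   (   C >= 0'a, C =< 0'z
        ->  true
        ;   C >= 0'A, C =< 0'Z
        )
    }.

% One or more letters, longest match first: try to consume more
% letters before offering to stop.
letters_longest([C|Cs]) -->
    ascii_letter(C),
    (   letters_longest(Cs)
    ;   { Cs = [] }
    ).

% One or more letters, shortest match first: offer to stop before
% trying to consume more letters.
letters_shortest([C|Cs]) -->
    ascii_letter(C),
    (   { Cs = [] }
    ;   letters_shortest(Cs)
    ).

% ?- once(phrase(letters_longest(M), `abc1`, R)), atom_codes(A, M), atom_codes(B, R).
%@ A = abc, B = '1'
% ?- once(phrase(letters_shortest(M), `abc1`, R)), atom_codes(A, M), atom_codes(B, R).
%@ A = a, B = bc1
```

With a sed-style loop that commits to the first solution, the longest-first order would remove a whole control word such as \Large in one step, whereas the shortest-first order would match only its first letter after the backslash.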

Thank you in advance for any pointers to related work.

Documentation is not available, sorry. I am exceptionally slow at writing
documentation, like a turtle.

Regards,

Kuniaki
