2 * Introduction to String Processing::
3 * String Input and Output::
6 * Octets and Utilities for Cryptography::
9 @c -----------------------------------------------------------------------------
10 @c -----------------------------------------------------------------------------
11 @node Introduction to String Processing, String Input and Output, Package stringproc, Package stringproc
12 @section Introduction to String Processing
14 The package @code{stringproc} contains functions for processing strings
15 and characters including formatting, encoding and data streams.
16 This package is complemented by some tools for cryptography, e.g. base64 and hash
19 It can be directly loaded via @code{load("stringproc")} or automatically by
20 using one of its functions.
22 For questions and bug reports, please contact the author. The following
23 command prints his e-mail address.
25 @code{printf(true, "~@{~a~@}@@gmail.com", split(sdowncase("Volker van Nek")))$}
28 A string is constructed by typing e.g. @code{"Text"}.
29 When the option variable @mref{stringdisp} is set to @code{false}, which is
30 the default, the double quotes won't be printed.
31 @ref{stringp} tests whether an object is a string.
40 Characters are represented by a string of length 1.
41 @ref{charp} is the corresponding test.
50 In Maxima, position indices in strings are 1-indexed, as in lists,
51 which results in the following consistency.
54 (%i1) is(charat("Lisp",1) = charlist("Lisp")[1]);
58 A string may contain Maxima expressions.
59 These can be parsed with @ref{parse_string}.
62 (%i1) map(parse_string, ["42" ,"sqrt(2)", "%pi"]);
63 (%o1) [42, sqrt(2), %pi]
65 (%o2) [42.0, 1.414213562373095, 3.141592653589793]
68 Strings can be processed as characters or in binary form as octets.
69 Functions for conversions are @ref{string_to_octets} and @ref{octets_to_string}.
70 Usable encodings depend on the platform, the application and the
72 (The following shows Maxima in GNU/Linux, compiled with SBCL.)
76 (%i2) string_to_octets("$@pounds{}@euro{}", "cp1252");
78 (%i3) string_to_octets("$@pounds{}@euro{}", "utf-8");
79 (%o3) [24, 0C2, 0A3, 0E2, 82, 0AC]
82 Strings may be written to character streams or as octets to binary streams.
83 The following example demonstrates file input and output of characters.
85 @ref{openw} returns an output stream to a file,
86 @ref{printf} writes formatted output to that file, and calling e.g.
87 @ref{close} writes all characters contained in the stream to the file.
90 (%i1) s: openw("file.txt");
91 (%o1) #<output stream file.txt>
92 (%i2) printf(s, "~%~d ~f ~a ~a ~f ~e ~a~%",
93 42, 1.234, sqrt(2), %pi, 1.0e-2, 1.0e-2, 1.0b-2)$
97 @ref{openr} then returns an input stream from the previously used file and
98 @ref{readline} returns the line read as a string.
99 The string may be tokenized by e.g. @ref{split} or @ref{tokens} and
100 finally parsed by @ref{parse_string}.
103 (%i4) s: openr("file.txt");
104 (%o4) #<input stream file.txt>
106 (%o5) 42 1.234 sqrt(2) %pi 0.01 1.0E-2 1.0b-2
107 (%i6) map(parse_string, split(%));
108 (%o6) [42, 1.234, sqrt(2), %pi, 0.01, 0.01, 1.0b-2]
112 @opencatbox{Categories:}
114 @category{Share packages}
115 @category{Package stringproc}
119 @c -----------------------------------------------------------------------------
120 @c -----------------------------------------------------------------------------
121 @node String Input and Output, Characters, Introduction to String Processing, Package stringproc
122 @section String Input and Output
124 Example: Formatted printing to a file.
127 (%i1) s: openw("file.txt");
128 (%o1) #<output stream file.txt>
130 "~2tAn atom: ~20t~a~%~2tand a list: ~20t~@{~r ~@}~%~2t\
131 and an integer: ~20t~d~%"$
132 (%i3) printf( s,control, 'true,[1,2,3],42 )$
136 (%i5) s: openr("file.txt");
137 (%o5) #<input stream file.txt>
138 (%i6) while stringp( tmp:readline(s) ) do print(tmp)$
140 and a list: one two three
145 @c -----------------------------------------------------------------------------
147 @deffn {Function} close (@var{stream})
149 Closes @var{stream} and returns @code{true} if @var{stream} had been open.
151 @opencatbox{Categories:}
152 @category{File input}
153 @category{File output}
154 @category{Package stringproc}
159 @c -----------------------------------------------------------------------------
161 @deffn {Function} flength (@var{stream})
163 @var{stream} has to be an open stream from or to a file.
164 @code{flength} then returns the number of bytes which are currently present in this file.
166 Example: See @ref{writebyte}.
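A minimal sketch of how @code{flength} might be combined with
@code{flush_output} on a character stream. The file name
@code{flength_demo.txt} is only an illustration and is not part of the package.

@example
/* Sketch only: "flength_demo.txt" is a hypothetical file name. */
out: openw("flength_demo.txt")$
printf(out, "abc")$
flush_output(out)$   /* make sure buffered characters reach the file */
flength(out);        /* number of bytes currently present in the file */
close(out)$
@end example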
168 @opencatbox{Categories:}
169 @category{File input}
170 @category{File output}
171 @category{Package stringproc}
176 @c -----------------------------------------------------------------------------
177 @anchor{flush_output}
178 @deffn {Function} flush_output (@var{stream})
180 Flushes @var{stream} where @var{stream} has to be an output stream to a file.
182 Example: See @ref{writebyte}.
184 @opencatbox{Categories:}
185 @category{File output}
186 @category{Package stringproc}
191 @c -----------------------------------------------------------------------------
193 @deffn {Function} fposition @
194 @fname{fposition} (@var{stream}) @
195 @fname{fposition} (@var{stream}, @var{pos})
197 Returns the current position in @var{stream}, if @var{pos} is not used.
198 If @var{pos} is used, @code{fposition} sets the position in @var{stream}.
199 @var{stream} has to be a stream from or to a file and
200 @var{pos} has to be a positive number.
202 Positions in data streams are 1-indexed, as in strings and lists,
203 i.e. the first element in @var{stream} is in position 1.
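The 1-indexed positioning described above can be sketched as follows.
The file name @code{fposition_demo.txt} and its contents are assumptions
made for this illustration.

@example
/* Sketch only: "fposition_demo.txt" is a hypothetical file name. */
out: openw("fposition_demo.txt")$
printf(out, "abcdef")$
close(out)$

in: openr("fposition_demo.txt")$
fposition(in);       /* current position; nothing has been read yet */
fposition(in, 4)$    /* move to the fourth element of the stream */
readchar(in);        /* with 1-indexing this is expected to be "d" */
close(in)$
@end example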
205 @opencatbox{Categories:}
206 @category{File input}
207 @category{File output}
208 @category{Package stringproc}
213 @c -----------------------------------------------------------------------------
215 @deffn {Function} freshline @
216 @fname{freshline} () @
217 @fname{freshline} (@var{stream})
219 Writes a new line to the standard output stream
220 if the position is not at the beginning of a line and returns @code{true}.
221 Using the optional argument @var{stream} the new line is written to that stream.
222 There are some cases where @code{freshline()} does not work as expected.
224 See also @ref{newline}.
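A small sketch of the intended difference between @code{freshline} and
@code{newline}, assuming output goes to a plain terminal; as noted above,
some front ends may behave differently.

@example
/* Sketch only: behaviour may differ depending on the front end. */
printf(true, "no line break yet")$
freshline()$   /* cursor is not at the start of a line: a new line is written */
freshline()$   /* cursor is at the start of a line: nothing should be written */
newline()$     /* newline writes a new line unconditionally */
printf(true, "next line~%")$
@end example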
226 @opencatbox{Categories:}
227 @category{File output}
228 @category{Package stringproc}
233 @c -----------------------------------------------------------------------------
234 @anchor{get_output_stream_string}
235 @deffn {Function} get_output_stream_string (@var{stream})
237 Returns a string containing all the characters currently present in
238 @var{stream} which must be an open string-output stream.
239 The returned characters are removed from @var{stream}.
241 Example: See @ref{make_string_output_stream}.
243 @opencatbox{Categories:}
244 @category{Package stringproc}
249 @c -----------------------------------------------------------------------------
250 @anchor{make_string_input_stream}
251 @deffn {Function} make_string_input_stream @
252 @fname{make_string_input_stream} (@var{string}) @
253 @fname{make_string_input_stream} (@var{string}, @var{start}) @
254 @fname{make_string_input_stream} (@var{string}, @var{start}, @var{end})
256 Returns an input stream which contains parts of @var{string} and an end of file.
257 Without optional arguments the stream contains the entire string
258 and is positioned in front of the first character.
259 @var{start} and @var{end} define the substring contained in the stream.
260 The first character is available at position 1.
263 (%i1) istream : make_string_input_stream("text", 1, 4);
264 (%o1) #<string-input stream from "text">
265 (%i2) (while (c : readchar(istream)) # false do sprint(c), newline())$
267 (%i3) close(istream)$
270 @opencatbox{Categories:}
271 @category{Package stringproc}
276 @c -----------------------------------------------------------------------------
277 @anchor{make_string_output_stream}
278 @deffn {Function} make_string_output_stream ()
280 Returns an output stream that accepts characters. Characters currently present
281 in this stream can be retrieved by @ref{get_output_stream_string}.
284 (%i1) ostream : make_string_output_stream();
285 (%o1) #<string-output stream 09622ea0>
286 (%i2) printf(ostream, "foo")$
288 (%i3) printf(ostream, "bar")$
290 (%i4) string : get_output_stream_string(ostream);
292 (%i5) printf(ostream, "baz")$
294 (%i6) string : get_output_stream_string(ostream);
296 (%i7) close(ostream)$
299 @opencatbox{Categories:}
300 @category{Package stringproc}
305 @c -----------------------------------------------------------------------------
307 @deffn {Function} newline @
309 @fname{newline} (@var{stream})
311 Writes a new line to the standard output stream.
312 Using the optional argument @var{stream} the new line is written to that stream.
313 There are some cases where @code{newline()} does not work as expected.
315 See @ref{sprint} for an example of using @code{newline()}.
317 @opencatbox{Categories:}
318 @category{File output}
319 @category{Package stringproc}
324 @c -----------------------------------------------------------------------------
326 @deffn {Function} opena (@var{file})
328 Returns a character output stream to @var{file}.
329 If an existing file is opened, @code{opena} appends elements at the end of @var{file}.
331 For binary output see @ref{Functions and Variables for binary input and output, , opena_binary}.
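The following sketch illustrates appending with @code{opena} after a file
has been created with @code{openw}. The file name @code{log.txt} and the
written text are assumptions made for this illustration.

@example
/* Sketch only: "log.txt" is a hypothetical file name. */
s: openw("log.txt")$          /* create the file, replacing any old content */
printf(s, "first line~%")$
close(s)$

s: opena("log.txt")$          /* reopen the file for appending */
printf(s, "second line~%")$
close(s)$

printfile("log.txt")$         /* display the file; both lines should appear */
@end example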
333 @opencatbox{Categories:}
334 @category{File output}
335 @category{Package stringproc}
340 @c -----------------------------------------------------------------------------
342 @deffn {Function} openr @
343 @fname{openr} (@var{file}) @
344 @fname{openr} (@var{file}, @var{encoding})
347 Returns a character input stream to @var{file}.
348 @code{openr} assumes that @var{file} already exists.
349 If reading the file results in a Lisp error about its encoding,
350 passing the correct string as the argument @var{encoding} might help.
351 The available encodings and their names depend on the Lisp being used.
352 For SBCL a list of suitable strings can be found at
353 @url{http://www.sbcl.org/manual/#External-Formats}.
355 For binary input see @ref{Functions and Variables for binary input and output, , openr_binary}.
356 See also @mref{close} and @mrefdot{openw}
359 (%i1) istream : openr("data.txt","EUC-JP");
360 (%o1) #<FD-STREAM for "file /home/gunter/data.txt" @{10099A3AE3@}>
361 (%i2) close(istream);
366 @opencatbox{Categories:}
367 @category{File input}
368 @category{Package stringproc}
373 @c -----------------------------------------------------------------------------
375 @deffn {Function} openw (@var{file})
377 Returns a character output stream to @var{file}.
378 If @var{file} does not exist, it will be created.
379 If an existing file is opened, @code{openw} destructively modifies @var{file}.
381 For binary output see @ref{Functions and Variables for binary input and output, , openw_binary}.
383 See also @mref{close} and @mrefdot{openr}
385 @opencatbox{Categories:}
386 @category{File output}
387 @category{Package stringproc}
392 @c -----------------------------------------------------------------------------
394 @deffn {Function} printf @
395 @fname{printf} (@var{dest}, @var{string}) @
396 @fname{printf} (@var{dest}, @var{string}, @var{expr_1}, ..., @var{expr_n})
398 Produces formatted output by outputting the characters of control-string
399 @var{string} and observing that a tilde introduces a directive.
400 The character after the tilde, possibly preceded by prefix parameters
401 and modifiers, specifies what kind of formatting is desired.
402 Most directives use one or more elements of the arguments
403 @var{expr_1}, ..., @var{expr_n} to create their output.
405 If @var{dest} is a stream or @code{true}, then @code{printf} returns @code{false}.
406 Otherwise, @code{printf} returns a string containing the output.
407 By default the streams @var{stdin}, @var{stdout} and @var{stderr} are defined.
408 If Maxima is running as a network client (which is the normal case if Maxima is communicating
409 with a graphical user interface, which must be the server) @code{setup-client}
410 will define @var{old_stdout} and @var{old_stderr}, too.
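A short sketch of how the return value depends on @var{dest}; the control
string and the variable name @code{str} are chosen only for illustration.

@example
/* dest = false: nothing is printed, the output is returned as a string */
str: printf(false, "~d item~p", 3, 3);

/* dest = true: the output goes to standard output, printf returns false */
printf(true, "~a~%", str);
@end example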
412 @code{printf} provides the Common Lisp function @code{format} in Maxima.
413 The following example illustrates the general relation between these two
417 (%i1) printf(true, "R~dD~d~%", 2, 2);
420 (%i2) :lisp (format t "R~dD~d~%" 2 2)
425 The following description is limited to a rough sketch of the possibilities of
427 The Lisp function @code{format} is described in detail in many reference books.
428 A helpful resource is e.g. the freely available online manual
429 "Common Lisp the Language" by Guy L. Steele. See chapter 22.3.3 there.
431 In addition, @code{printf} recognizes two format directives which are not known to Lisp @code{format}.
432 The format directive @code{~m} indicates Maxima pretty printer output.
433 The format directive @code{~h} indicates a bigfloat number.
443 ~x hexadecimal integer
448 ~e scientific notation
449 ~g ~f or ~e, depending upon magnitude
451 ~a uses Maxima function string
452 ~m Maxima pretty printer output
453 ~s like ~a, but output enclosed in "double quotes"
455 ~< justification, ~> terminates
456 ~( case conversion, ~) terminates
457 ~[ selection, ~] terminates
458 ~@{ iteration, ~@} terminates
461 Note that the directive ~* is not supported.
467 (%i1) printf( false, "~a ~a ~4f ~a ~@@r",
468 "String",sym,bound,sqrt(12),144), bound = 1.234;
469 (%o1) String sym 1.23 2*sqrt(3) CXLIV
470 (%i2) printf( false,"~@{~a ~@}",["one",2,"THREE"] );
472 (%i3) printf(true,"~@{~@{~9,1f ~@}~%~@}",mat ),
473 mat = args(matrix([1.1,2,3.33],[4,5,6],[7,8.88,9]))$
477 (%i4) control: "~:(~r~) bird~p ~[is~;are~] singing."$
478 (%i5) printf( false,control, n,n,if n=1 then 1 else 2 ), n=2;
479 (%o5) Two birds are singing.
482 The directive ~h has been introduced to handle bigfloats.
487 d : decimal digits behind floating point
488 e : minimal exponent digits
489 x : preferred exponent
490 o : overflow character
491 p : padding character
492 @@ : display sign for positive numbers
497 (%i2) printf(true, "|~h|~%", 2.b0^-64)$
498 |0.0000000000000000000542101086242752217003726400434970855712890625|
500 (%i4) printf(true, "|~h|~%", sqrt(2))$
501 |1.4142135623730950488016887|
503 (%i6) printf(true, "|~h|~%", sqrt(2))$
504 |1.41421356237309504880169|
505 (%i7) printf(true, "|~28h|~%", sqrt(2))$
506 | 1.41421356237309504880169|
507 (%i8) printf(true, "|~28,,,,,'*h|~%", sqrt(2))$
508 |***1.41421356237309504880169|
509 (%i9) printf(true, "|~,18h|~%", sqrt(2))$
510 |1.414213562373095049|
511 (%i10) printf(true, "|~,,,-3h|~%", sqrt(2))$
512 |1414.21356237309504880169b-3|
513 (%i11) printf(true, "|~,,2,-3h|~%", sqrt(2))$
514 |1414.21356237309504880169b-03|
515 (%i12) printf(true, "|~20h|~%", sqrt(2))$
516 |1.41421356237309504880169|
517 (%i13) printf(true, "|~20,,,,'+h|~%", sqrt(2))$
518 |++++++++++++++++++++|
521 For conversion of objects to strings also see @mrefcomma{concat} @mrefcomma{sconcat}
522 @mref{string} and @mrefdot{simplode}
524 @opencatbox{Categories:}
525 @category{File output}
526 @category{Package stringproc}
531 @c -----------------------------------------------------------------------------
533 @deffn {Function} readbyte (@var{stream})
535 Removes and returns the first byte in @var{stream} which must be a binary input stream.
536 If the end of file is encountered @code{readbyte} returns @code{false}.
538 Example: Read the first 16 bytes from a file encrypted with AES in OpenSSL.
541 (%i1) ibase: obase: 16.$
543 (%i2) in: openr_binary("msg.bin");
544 (%o2) #<input stream msg.bin>
545 (%i3) (L:[], thru 16. do push(readbyte(in), L), L:reverse(L));
546 (%o3) [53, 61, 6C, 74, 65, 64, 5F, 5F, 88, 56, 0DE, 8A, 74, 0FD,
550 (%i5) map(ascii, rest(L,-8));
551 (%o5) [S, a, l, t, e, d, _, _]
552 (%i6) salt: octets_to_number(rest(L,8));
553 (%o6) 8856de8a74fdadf0
556 @opencatbox{Categories:}
557 @category{File input}
558 @category{Package stringproc}
563 @c -----------------------------------------------------------------------------
565 @deffn {Function} readchar (@var{stream})
567 Removes and returns the first character in @var{stream}.
568 If the end of file is encountered @code{readchar} returns @code{false}.
570 Example: See @ref{make_string_input_stream}.
572 @opencatbox{Categories:}
573 @category{File input}
574 @category{Package stringproc}
579 @c -----------------------------------------------------------------------------
581 @deffn {Function} readline (@var{stream})
583 Returns a string containing all characters starting at the current position
584 in @var{stream} up to the end of the line or @code{false}
585 if the end of the file is encountered.
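Because @code{readline} returns @code{false} at the end of the file, a loop
can collect all lines of a file into a list. This is a sketch which assumes
that a text file named @code{file.txt} already exists; the variable names
are illustrative.

@example
/* Sketch only: assumes an existing text file "file.txt". */
in: openr("file.txt")$
lines: []$
/* readline returns a string for each line and false at end of file */
while stringp(line: readline(in)) do lines: endcons(line, lines)$
close(in)$
lines;    /* list of the lines, in file order */
@end example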
587 @opencatbox{Categories:}
588 @category{File input}
589 @category{Package stringproc}
594 @c -----------------------------------------------------------------------------
596 @deffn {Function} sprint (@var{expr_1}, @dots{}, @var{expr_n})
598 Evaluates and displays its arguments one after the other `on a line' starting at
599 the leftmost position. Each expression is printed with a space character right after
600 it, and line length is disregarded.
601 @code{newline()} might be used for line breaking.
603 Example: Sequential printing with @code{sprint}.
604 Creating a new line with @code{newline()}.
607 (%i1) for n:0 thru 19 do sprint(fib(n))$
608 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
609 (%i2) for n:0 thru 22 do (
611 if mod(n,10) = 9 then newline() )$
612 0 1 1 2 3 5 8 13 21 34
613 55 89 144 233 377 610 987 1597 2584 4181
617 @opencatbox{Categories:}
618 @category{Package stringproc}
622 @c -----------------------------------------------------------------------------
624 @deffn {Function} writebyte (@var{byte}, @var{stream})
626 Writes @var{byte} to @var{stream} which must be a binary output stream.
627 @code{writebyte} returns @code{byte}.
629 Example: Write some bytes to a binary file output stream.
630 In this example all bytes correspond to printable characters and are printed
632 The bytes remain in the stream until @code{flush_output} or @code{close} has been called.
635 (%i1) ibase: obase: 16.$
637 (%i2) bytes: map(cint, charlist("GNU/Linux"));
638 (%o2) [47, 4E, 55, 2F, 4C, 69, 6E, 75, 78]
639 (%i3) out: openw_binary("test.bin");
640 (%o3) #<output stream test.bin>
641 (%i4) for i thru 3 do writebyte(bytes[i], out);
643 (%i5) printfile("test.bin")$
647 (%i7) flush_output(out);
651 (%i9) printfile("test.bin")$
653 (%i0A) for b in rest(bytes,3) do writebyte(b, out);
657 (%i0C) printfile("test.bin")$
661 @opencatbox{Categories:}
662 @category{File output}
663 @category{Package stringproc}
668 @c -----------------------------------------------------------------------------
669 @c -----------------------------------------------------------------------------
670 @node Characters, String Processing, String Input and Output, Package stringproc
673 Characters are strings of length 1.
675 @c -----------------------------------------------------------------------------
676 @anchor{adjust_external_format}
677 @deffn {Function} adjust_external_format ()
679 Prints information about the current external format of the Lisp reader.
680 In case the external format encoding differs from the encoding of the
681 application which runs Maxima, @code{adjust_external_format} tries to adjust
682 the encoding or prints some help or instructions.
683 @code{adjust_external_format} returns @code{true} when the external format has
684 been changed and @code{false} otherwise.
686 Functions like @ref{cint}, @ref{unicode}, @ref{octets_to_string}
687 and @ref{string_to_octets} need UTF-8 as the external format of the
688 Lisp reader to work properly over the full range of Unicode characters.
690 Examples (Maxima on Windows, March 2016):
691 Using @code{adjust_external_format} when the default external format
692 is not equal to the encoding provided by the application.
694 1. Command line Maxima
696 In case a terminal session is preferred it is recommended to use Maxima compiled
697 with SBCL. Here Unicode support is provided by default and calls to
698 @code{adjust_external_format} are unnecessary.
700 If Maxima is compiled with CLISP or GCL it is recommended to change
701 the terminal encoding from CP850 to CP1252.
702 @code{adjust_external_format} prints some help.
704 CCL reads UTF-8 while the terminal input is CP850 by default.
705 CP1252 is not supported by CCL. @code{adjust_external_format}
706 prints instructions for changing the terminal encoding and external format
711 In wxMaxima SBCL reads CP1252 by default but the input from the application
712 is UTF-8 encoded. Adjustment is needed.
714 Calling @code{adjust_external_format} and restarting Maxima
715 permanently changes the default external format to UTF-8.
718 (%i1)adjust_external_format();
720 (setf sb-impl::*default-external-format* :utf-8)
721 has been appended to the init file
722 C:/Users/Username/.sbclrc
723 Please restart Maxima to set the external format to UTF-8.
730 (%i1) adjust_external_format();
731 The external format is currently UTF-8
732 and has not been changed.
736 @opencatbox{Categories:}
737 @category{Package stringproc}
742 @c -----------------------------------------------------------------------------
744 @deffn {Function} alphacharp (@var{char})
746 Returns @code{true} if @var{char} is an alphabetic character.
748 To identify a non-US-ASCII character as an alphabetic character
749 the underlying Lisp must provide full Unicode support.
750 E.g. a German umlaut is detected as an alphabetic character with SBCL in GNU/Linux
752 (On Windows, the external format of Maxima compiled with SBCL must be set to UTF-8.
753 See @ref{adjust_external_format} for more.)
755 Example: Examination of non-US-ASCII characters.
757 The underlying Lisp (SBCL, GNU/Linux) is able to convert the typed character
758 into a Lisp character and to examine it.
761 (%i1) alphacharp("@"u");
765 In GCL this is not possible. An error break occurs.
768 (%i1) alphacharp("u");
770 (%i2) alphacharp("@"u");
772 package stringproc: @"u cannot be converted into a Lisp character.
776 @opencatbox{Categories:}
777 @category{Predicate functions}
778 @category{Package stringproc}
783 @c -----------------------------------------------------------------------------
784 @anchor{alphanumericp}
785 @deffn {Function} alphanumericp (@var{char})
787 Returns @code{true} if @var{char} is an alphabetic character or a digit
788 (only corresponding US-ASCII characters are regarded as digits).
790 Note: See remarks on @ref{alphacharp}.
792 @opencatbox{Categories:}
793 @category{Predicate functions}
794 @category{Package stringproc}
799 @c -----------------------------------------------------------------------------
801 @deffn {Function} ascii (@var{int})
803 Returns the US-ASCII character corresponding to the integer @var{int}
804 which has to be less than @code{128}.
806 See @ref{unicode} for converting code points larger than @code{127}.
811 (%i1) for n from 0 thru 127 do (
813 if alphacharp(ch) then sprint(ch),
814 if n = 96 then newline() )$
815 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
816 a b c d e f g h i j k l m n o p q r s t u v w x y z
819 @opencatbox{Categories:}
820 @category{Package stringproc}
825 @c -----------------------------------------------------------------------------
827 @deffn {Function} cequal (@var{char_1}, @var{char_2})
829 Returns @code{true} if @var{char_1} and @var{char_2} are the same character.
831 @opencatbox{Categories:}
832 @category{Predicate functions}
833 @category{Package stringproc}
838 @c -----------------------------------------------------------------------------
839 @anchor{cequalignore}
840 @deffn {Function} cequalignore (@var{char_1}, @var{char_2})
842 Like @code{cequal} but ignores case. For non-US-ASCII characters this is only
843 possible when the underlying Lisp is able to recognize a character as an
844 alphabetic character. See remarks on @ref{alphacharp}.
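A brief sketch contrasting the case-sensitive and case-insensitive character
comparisons, restricted to US-ASCII characters so that it does not depend on
the Unicode support of the underlying Lisp.

@example
cequal("a", "A");         /* compares the characters exactly */
cequalignore("a", "A");   /* same comparison, but ignoring case */
clessp("A", "a");         /* compares code points: 65 against 97 */
cgreaterp("b", "a");      /* compares code points: 98 against 97 */
@end example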
846 @opencatbox{Categories:}
847 @category{Predicate functions}
848 @category{Package stringproc}
853 @c -----------------------------------------------------------------------------
855 @deffn {Function} cgreaterp (@var{char_1}, @var{char_2})
857 Returns @code{true} if the code point of @var{char_1} is greater than the
858 code point of @var{char_2}.
860 @opencatbox{Categories:}
861 @category{Predicate functions}
862 @category{Package stringproc}
867 @c -----------------------------------------------------------------------------
868 @anchor{cgreaterpignore}
869 @deffn {Function} cgreaterpignore (@var{char_1}, @var{char_2})
871 Like @code{cgreaterp} but ignores case. For non-US-ASCII characters this is only
872 possible when the underlying Lisp is able to recognize a character as an
873 alphabetic character. See remarks on @ref{alphacharp}.
875 @opencatbox{Categories:}
876 @category{Predicate functions}
877 @category{Package stringproc}
882 @c -----------------------------------------------------------------------------
884 @deffn {Function} charp (@var{obj})
886 Returns @code{true} if @var{obj} is a Maxima character.
887 See the introduction for an example.
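Since a Maxima character is a string of length 1, the relation between
@code{charp} and @code{stringp} can be sketched as follows.

@example
charp("a");     /* a string of length 1 is a character */
charp("ab");    /* a longer string is not a character */
stringp("ab");  /* but it is still a string */
@end example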
889 @opencatbox{Categories:}
890 @category{Predicate functions}
891 @category{Package stringproc}
896 @c -----------------------------------------------------------------------------
898 @deffn {Function} cint (@var{char})
900 Returns the Unicode code point of @var{char} which must be a
901 Maxima character, i.e. a string of length @code{1}.
903 Examples: The hexadecimal code point of some characters
904 (Maxima with SBCL on GNU/Linux).
908 (%i2) map(cint, ["$","@pounds{}","@euro{}"]);
909 (%o2) [24, 0A3, 20AC]
912 Warning: It is not possible to enter characters corresponding to code points
913 larger than 16 bit in wxMaxima with SBCL on Windows when the external format
914 has not been set to UTF-8. See @ref{adjust_external_format}.
916 @c Command @U not supported by texinfo 5.
918 @c (%i3) cint("@U{1d538}");
922 CMUCL doesn't process these characters as one character.
923 @code{cint} then returns @code{false}.
924 @c Converting to UTF-8-octets and finally to Unicode serves as a workaround.
925 Converting a character to a code point via UTF-8-octets may serve as a workaround:
927 @code{utf8_to_unicode(string_to_octets(character));}
929 @c Command @U not supported by texinfo 5.
931 @c (%i4) utf8_to_unicode(string_to_octets("@U{1d538}"));
935 See @ref{utf8_to_unicode}, @ref{string_to_octets}.
937 @opencatbox{Categories:}
938 @category{Package stringproc}
943 @c -----------------------------------------------------------------------------
945 @deffn {Function} clessp (@var{char_1}, @var{char_2})
947 Returns @code{true} if the code point of @var{char_1} is less than the
948 code point of @var{char_2}.
950 @opencatbox{Categories:}
951 @category{Predicate functions}
952 @category{Package stringproc}
957 @c -----------------------------------------------------------------------------
958 @anchor{clesspignore}
959 @deffn {Function} clesspignore (@var{char_1}, @var{char_2})
961 Like @code{clessp} but ignores case. For non-US-ASCII characters this is only
962 possible when the underlying Lisp is able to recognize a character as an
963 alphabetic character. See remarks on @ref{alphacharp}.
965 @opencatbox{Categories:}
966 @category{Predicate functions}
967 @category{Package stringproc}
972 @c -----------------------------------------------------------------------------
974 @deffn {Function} constituent (@var{char})
976 Returns @code{true} if @var{char} is a graphic character but not a space character.
977 A graphic character is a character one can see, plus the space character.
978 (@code{constituent} is defined by Paul Graham.
979 See Paul Graham, ANSI Common Lisp, 1996, page 67.)
982 (%i1) for n from 0 thru 255 do (
983 tmp: ascii(n), if constituent(tmp) then sprint(tmp) )$
984 ! " # % ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @@ A B
985 C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c
986 d e f g h i j k l m n o p q r s t u v w x y z @{ | @} ~
989 @opencatbox{Categories:}
990 @category{Predicate functions}
991 @category{Package stringproc}
996 @c -----------------------------------------------------------------------------
997 @c @deffn {Function} cunlisp (@var{lisp_char})
998 @c Converts a Lisp-character into a Maxima-character.
999 @c (You won't need it.)
1001 @c @opencatbox{Categories:}
1002 @c @category{Package stringproc}
1007 @c -----------------------------------------------------------------------------
1009 @deffn {Function} digitcharp (@var{char})
1011 Returns @code{true} if @var{char} is a digit where only the corresponding
1012 US-ASCII-character is regarded as a digit.
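A sketch of how the character predicates might be applied to every character
of a string; the sample string is an arbitrary choice for this illustration.

@example
chars: charlist("R2-D2")$       /* split the string into characters */
map(digitcharp, chars);         /* one true/false entry per character */
sublist(chars, digitcharp);     /* keep only the digit characters */
sublist(chars, alphacharp);     /* keep only the alphabetic characters */
@end example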
1014 @opencatbox{Categories:}
1015 @category{Predicate functions}
1016 @category{Package stringproc}
1021 @c -----------------------------------------------------------------------------
1022 @c @deffn {Function} lcharp (@var{obj})
1023 @c Returns @code{true} if @var{obj} is a Lisp-character.
1024 @c (You won't need it.)
1026 @c @opencatbox{Categories:}
1027 @c @category{Predicate functions}
1028 @c @category{Package stringproc}
1033 @c -----------------------------------------------------------------------------
1035 @deffn {Function} lowercasep (@var{char})
1037 Returns @code{true} if @var{char} is a lowercase character.
1039 Note: See remarks on @ref{alphacharp}.
1041 @opencatbox{Categories:}
1042 @category{Predicate functions}
1043 @category{Package stringproc}
1048 @c -----------------------------------------------------------------------------
1049 @anchor{newline_variable}
1050 @defvr {Variable} newline
1052 The newline character (ASCII-character 10).
1054 @opencatbox{Categories:}
1055 @category{Global variables}
1056 @category{Package stringproc}
1061 @c -----------------------------------------------------------------------------
1062 @anchor{space_variable}
1063 @defvr {Variable} space
1065 The space character.
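The character variables @code{newline}, @code{space} and @code{tab} can be
used wherever a string is expected. The following is a sketch with
illustrative variable names; it loads the package explicitly on the
assumption that the variables are not yet defined.

@example
load("stringproc")$
line: sconcat("name", tab, "value", newline)$   /* build one text line */
slength(line);                                  /* tab and newline count as one character each */
simplode(["a", "b", "c"], space);               /* join strings with the space character */
@end example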
1067 @opencatbox{Categories:}
1068 @category{Global variables}
1069 @category{Package stringproc}
1074 @c -----------------------------------------------------------------------------
1075 @anchor{tab_variable}
1076 @defvr {Variable} tab
1080 @opencatbox{Categories:}
1081 @category{Global variables}
1082 @category{Package stringproc}
1087 @c -----------------------------------------------------------------------------
1089 @deffn {Function} unicode (@var{arg})
1091 Returns the character defined by @var{arg} which might be a Unicode code point
1092 or a name string if the underlying Lisp provides full Unicode support.
1094 Example: Characters defined by hexadecimal code points
1095 (Maxima with SBCL on GNU/Linux).
1099 (%i2) map(unicode, [24, 0A3, 20AC]);
1100 (%o2) [$, @pounds{}, @euro{}]
1103 Warning: In wxMaxima with SBCL on Windows it is not possible to convert
1104 code points larger than 16 bit to characters when the external format
1105 has not been set to UTF-8. See @ref{adjust_external_format} for more information.
1107 @c Command @U not supported by texinfo 5.
1109 @c (%i3) unicode(1D538);
1113 CMUCL doesn't process code points larger than 16 bit.
1114 In these cases @code{unicode} returns @code{false}.
1115 @c Converting characters to UTF-8 octets and finally to Unicode serves as a workaround.
1116 Converting a code point to a character via UTF-8 octets may serve as a workaround:
1118 @code{octets_to_string(unicode_to_utf8(code_point));}
1120 @c Command @U not supported by texinfo 5.
1122 @c (%i4) octets_to_string(unicode_to_utf8(1D538));
1126 See @ref{octets_to_string}, @ref{unicode_to_utf8}.
1128 In case the underlying Lisp provides full Unicode support the character might be
1129 specified by its name. The following is possible in ECL, CLISP and SBCL,
1130 where in SBCL on Windows the external format has to be set to UTF-8.
1131 @code{unicode(name)} is supported by CMUCL too but again limited to 16 bit
1134 The string argument to @code{unicode} is essentially the string returned by
1135 @code{printf} with the @code{~@@c} directive,
1136 except that the prefix @code{#\} must be omitted, as shown below.
1137 Underscores may be replaced by spaces and uppercase letters by lowercase ones.
1139 Example (continued): Characters defined by names
1140 (Maxima with SBCL on GNU/Linux).
1143 (%i3) printf(false, "~@@c", unicode(0DF));
1144 (%o3) #\LATIN_SMALL_LETTER_SHARP_S
1145 (%i4) unicode("LATIN_SMALL_LETTER_SHARP_S");
1147 (%i5) unicode("Latin small letter sharp s");
1151 @opencatbox{Categories:}
1152 @category{Package stringproc}
1157 @c -----------------------------------------------------------------------------
1158 @anchor{unicode_to_utf8}
1159 @deffn {Function} unicode_to_utf8 (@var{code_point})
1161 Returns a list containing the UTF-8 code corresponding to the Unicode @var{code_point}.
1163 Examples: Converting Unicode code points to UTF-8 and vice versa.
1166 (%i1) ibase: obase: 16.$
1167 (%i2) map(cint, ["$","@pounds{}","@euro{}"]);
1168 (%o2) [24, 0A3, 20AC]
1169 (%i3) map(unicode_to_utf8, %);
1170 (%o3) [[24], [0C2, 0A3], [0E2, 82, 0AC]]
1171 (%i4) map(utf8_to_unicode, %);
1172 (%o4) [24, 0A3, 20AC]
1175 @opencatbox{Categories:}
1176 @category{Package stringproc}
1181 @c -----------------------------------------------------------------------------
1183 @deffn {Function} uppercasep (@var{char})
1185 Returns @code{true} if @var{char} is an uppercase character.
1187 Note: See remarks on @ref{alphacharp}.
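Example: a minimal sketch restricted to US-ASCII letters, so that the remarks on non-US-ASCII characters do not apply. The sample characters are arbitrary.

@example
(%i1) uppercasep("A");
(%o1) true
(%i2) uppercasep("a");
(%o2) false
@end example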
1189 @opencatbox{Categories:}
1190 @category{Predicate functions}
1191 @category{Package stringproc}
1196 @c -----------------------------------------------------------------------------
1197 @anchor{us_ascii_only}
1198 @defvr {Variable} us_ascii_only
1200 This option variable affects Maxima when the character encoding
1201 provided by the application which runs Maxima is UTF-8 but the
1202 external format of the Lisp reader is not equal to UTF-8.
1204 On GNU/Linux this is the case when Maxima is built with GCL,
1205 and on Windows in wxMaxima with GCL and SBCL builds.
1206 With SBCL it is recommended to change the external format to UTF-8.
1207 Setting @code{us_ascii_only} is unnecessary then.
1208 See @ref{adjust_external_format} for details.
1210 @code{us_ascii_only} is @code{false} by default.
1211 In the situation described above, Maxima itself then parses the UTF-8 encoding.
1213 When @code{us_ascii_only} is set to @code{true} it is assumed that all strings
1214 used as arguments to string processing functions do not contain non-US-ASCII characters.
1215 Given that promise, Maxima avoids parsing UTF-8 and strings can be processed more efficiently.
1217 @opencatbox{Categories:}
1218 @category{Global variables}
1219 @category{Package stringproc}
1224 @c -----------------------------------------------------------------------------
1225 @anchor{utf8_to_unicode}
1226 @deffn {Function} utf8_to_unicode (@var{list})
1228 Returns a Unicode code point corresponding to the @var{list} which must contain
1229 the UTF-8 encoding of a single character.
1231 Examples: See @ref{unicode_to_utf8}.
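Example: a brief sketch using decimal input and output (the default @code{ibase} and @code{obase}). The octets chosen here are the UTF-8 encoding of the euro sign, code point 20AC in hexadecimal.

@example
(%i1) utf8_to_unicode([226, 130, 172]);
(%o1) 8364
(%i2) unicode_to_utf8(8364);
(%o2) [226, 130, 172]
@end example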
1233 @opencatbox{Categories:}
1234 @category{Package stringproc}
1239 @c -----------------------------------------------------------------------------
1240 @c -----------------------------------------------------------------------------
1241 @node String Processing, Octets and Utilities for Cryptography, Characters
1242 @section String Processing
1244 As in Maxima lists, position indices in strings start at 1.
1245 See the example in @ref{charat}.
1247 @c -----------------------------------------------------------------------------
1249 @deffn {Function} charat (@var{string}, @var{n})
1251 Returns the @var{n}-th character of @var{string}.
1252 The first character in @var{string} is returned with @var{n} = 1.
1255 (%i1) charat("Lisp",1);
1257 (%i2) charlist("Lisp")[1];
1261 @opencatbox{Categories:}
1262 @category{Package stringproc}
1267 @c -----------------------------------------------------------------------------
1269 @deffn {Function} charlist (@var{string})
1271 Returns the list of all characters in @var{string}.
1274 (%i1) charlist("Lisp");
1278 @opencatbox{Categories:}
1279 @category{Package stringproc}
1284 @c -----------------------------------------------------------------------------
1285 @anchor{eval_string}
1286 @deffn {Function} eval_string (@var{str})
1288 Parse the string @var{str} as a Maxima expression and evaluate it.
1289 The string @var{str} may or may not have a terminator (dollar sign @code{$} or semicolon @code{;}).
1290 If @var{str} contains more than one expression, only the first one is parsed and evaluated.
1292 Signals an error if @var{str} is not a string.
1297 (%i1) eval_string ("foo: 42; bar: foo^2 + baz");
1299 (%i2) eval_string ("(foo: 42, bar: foo^2 + baz)");
1303 See also @ref{parse_string} and @ref{eval_string_lisp}.
1305 @opencatbox{Categories:}
1306 @category{Package stringproc}
1311 @c -----------------------------------------------------------------------------
1312 @anchor{parse_string}
1313 @deffn {Function} parse_string (@var{str})
1315 Parse the string @var{str} as a Maxima expression (do not evaluate it).
1316 The string @var{str} may or may not have a terminator (dollar sign @code{$} or semicolon @code{;}).
1317 If @var{str} contains more than one expression, only the first one is parsed.
1319 Signals an error if @var{str} is not a string.
1324 (%i1) parse_string ("foo: 42; bar: foo^2 + baz");
1326 (%i2) parse_string ("(foo: 42, bar: foo^2 + baz)");
1328 (%o2) (foo : 42, bar : foo + baz)
1331 See also @ref{eval_string}.
1333 @opencatbox{Categories:}
1334 @category{Package stringproc}
1339 @c -----------------------------------------------------------------------------
1341 @deffn {Function} scopy (@var{string})
1343 Returns a copy of @var{string} as a new string.
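Example: a minimal sketch. The variable names @code{s} and @code{t} are chosen only for illustration. The copy contains the same sequence of characters as the original.

@example
(%i1) s: "Lisp"$
(%i2) t: scopy(s)$
(%i3) sequal(s, t);
(%o3) true
@end example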
1345 @opencatbox{Categories:}
1346 @category{Package stringproc}
1351 @c -----------------------------------------------------------------------------
1353 @deffn {Function} sdowncase @
1354 @fname{sdowncase} (@var{string}) @
1355 @fname{sdowncase} (@var{string}, @var{start}) @
1356 @fname{sdowncase} (@var{string}, @var{start}, @var{end})
1358 Like @ref{supcase} but uppercase characters are converted to lowercase.
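Example: a brief sketch with an arbitrary sample string. The outputs assume the default setting of @code{stringdisp}, so no double quotes are printed.

@example
(%i1) sdowncase("MAXIMA");
(%o1) maxima
(%i2) sdowncase("MAXIMA", 2);
(%o2) Maxima
@end example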
1360 @opencatbox{Categories:}
1361 @category{Package stringproc}
1366 @c -----------------------------------------------------------------------------
1368 @deffn {Function} sequal (@var{string_1}, @var{string_2})
1370 Returns @code{true} if @var{string_1} and @var{string_2} contain the same
1371 sequence of characters.
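Example: a minimal sketch with arbitrary sample strings, showing that the comparison is case sensitive.

@example
(%i1) sequal("Lisp", "Lisp");
(%o1) true
(%i2) sequal("Lisp", "LISP");
(%o2) false
@end example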
1373 @opencatbox{Categories:}
1374 @category{Predicate functions}
1375 @category{Package stringproc}
1380 @c -----------------------------------------------------------------------------
1381 @anchor{sequalignore}
1382 @deffn {Function} sequalignore (@var{string_1}, @var{string_2})
1384 Like @code{sequal} but ignores case. For non-US-ASCII characters
1385 this is only possible when the underlying Lisp is able to recognize a character as an
1386 alphabetic character. See remarks on @ref{alphacharp}.
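Example: a minimal sketch restricted to US-ASCII strings, which are chosen only for illustration.

@example
(%i1) sequalignore("Lisp", "LISP");
(%o1) true
(%i2) sequalignore("Lisp", "Lips");
(%o2) false
@end example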
1388 @opencatbox{Categories:}
1389 @category{Predicate functions}
1390 @category{Package stringproc}
1395 @c -----------------------------------------------------------------------------
1397 @deffn {Function} sexplode (@var{string})
1399 @code{sexplode} is an alias for the function @code{charlist}.
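Example: a brief sketch with an arbitrary sample string. @code{simplode} without a delimiter puts the characters back together.

@example
(%i1) sexplode("Lisp");
(%o1) [L, i, s, p]
(%i2) simplode(%);
(%o2) Lisp
@end example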
1401 @opencatbox{Categories:}
1402 @category{Package stringproc}
1407 @c -----------------------------------------------------------------------------
1409 @deffn {Function} simplode @
1410 @fname{simplode} (@var{list}) @
1411 @fname{simplode} (@var{list}, @var{delim})
1413 @code{simplode} takes a list of expressions and concatenates them into a string.
1414 If @var{delim} is not given, the expressions are joined without a delimiter.
1415 @var{delim} can be any string.
1417 See also @mrefcomma{concat} @mrefcomma{sconcat} @mref{string} and @mrefdot{printf}
1422 (%i1) simplode(["xx[",3,"]:",expand((x+y)^3)]);
1423 (%o1) xx[3]:y^3+3*x*y^2+3*x^2*y+x^3
1424 (%i2) simplode( sexplode("stars")," * " );
1425 (%o2) s * t * a * r * s
1426 (%i3) simplode( ["One","more","coffee."]," " );
1427 (%o3) One more coffee.
1430 @opencatbox{Categories:}
1431 @category{Package stringproc}
1436 @c -----------------------------------------------------------------------------
1438 @deffn {Function} sinsert (@var{seq}, @var{string}, @var{pos})
1439 Returns a string that is a concatenation of @code{substring(@var{string}, 1, @var{pos}-1)},
1440 the string @var{seq} and @code{substring (@var{string}, @var{pos})}.
1441 Note that the first character in @var{string} is in position 1.
1446 (%i1) s: "A submarine."$
1447 (%i2) concat( substring(s,1,3),"yellow ",substring(s,3) );
1448 (%o2) A yellow submarine.
1449 (%i3) sinsert("hollow ",s,3);
1450 (%o3) A hollow submarine.
1453 @opencatbox{Categories:}
1454 @category{Package stringproc}
1459 @c -----------------------------------------------------------------------------
1460 @anchor{sinvertcase}
1461 @deffn {Function} sinvertcase @
1462 @fname{sinvertcase} (@var{string}) @
1463 @fname{sinvertcase} (@var{string}, @var{start}) @
1464 @fname{sinvertcase} (@var{string}, @var{start}, @var{end})
1466 Returns @var{string} except that the case of each character from position @var{start} to @var{end} is inverted.
1467 If @var{end} is not given,
1468 the case of all characters from @var{start} to the end of @var{string} is inverted.
1473 (%i1) sinvertcase("sInvertCase");
1477 @opencatbox{Categories:}
1478 @category{Package stringproc}
1483 @c -----------------------------------------------------------------------------
1485 @deffn {Function} slength (@var{string})
1487 Returns the number of characters in @var{string}.
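Example: a minimal sketch with arbitrary sample strings, including the empty string.

@example
(%i1) slength("Lisp");
(%o1) 4
(%i2) slength("");
(%o2) 0
@end example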
1489 @opencatbox{Categories:}
1490 @category{Package stringproc}
1495 @c -----------------------------------------------------------------------------
1497 @deffn {Function} smake (@var{num}, @var{char})
1499 Returns a new string consisting of @var{num} copies of the character @var{char}.
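Example: a minimal sketch. The count and the character are arbitrary.

@example
(%i1) smake(3, "w");
(%o1) www
@end example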
1508 @opencatbox{Categories:}
1509 @category{Package stringproc}
1514 @c -----------------------------------------------------------------------------
1516 @deffn {Function} smismatch @
1517 @fname{smismatch} (@var{string_1}, @var{string_2}) @
1518 @fname{smismatch} (@var{string_1}, @var{string_2}, @var{test})
1520 Returns the position of the first character of @var{string_1} at which @var{string_1} and @var{string_2} differ, or @code{false} if they do not differ.
1521 Default test function for matching is @code{sequal}.
1522 If @code{smismatch} should ignore case, use @code{sequalignore} as test.
1527 (%i1) smismatch("seven","seventh");
1531 @opencatbox{Categories:}
1532 @category{Package stringproc}
1537 @c -----------------------------------------------------------------------------
1539 @deffn {Function} split @
1540 @fname{split} (@var{string}) @
1541 @fname{split} (@var{string}, @var{delim}) @
1542 @fname{split} (@var{string}, @var{delim}, @var{multiple})
1544 Returns the list of all tokens in @var{string}.
1545 Each token is an unparsed string.
1546 @code{split} uses @var{delim} as delimiter.
1547 If @var{delim} is not given, the space character is the default delimiter.
1548 @var{multiple} is a boolean argument which is @code{true} by default.
1549 In that case consecutive delimiters are read as one.
1550 This is useful if tabs are saved as multiple space characters.
1551 If @var{multiple} is set to @code{false}, each delimiter is recognized separately, so consecutive delimiters yield empty tokens.
1556 (%i1) split("1.2 2.3 3.4 4.5");
1557 (%o1) [1.2, 2.3, 3.4, 4.5]
1558 (%i2) split("first;;third;fourth",";",false);
1559 (%o2) [first, , third, fourth]
1562 @opencatbox{Categories:}
1563 @category{Package stringproc}
1568 @c -----------------------------------------------------------------------------
1570 @deffn {Function} sposition (@var{char}, @var{string})
1571 Returns the position of the first character in @var{string} which matches @var{char}.
1572 The first character in @var{string} is in position 1.
1573 For matching characters ignoring case see @ref{ssearch}.
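Example: a brief sketch with an arbitrary sample string. The returned position can be passed to @code{charat} to get the character back.

@example
(%i1) sposition("s", "Lisp");
(%o1) 3
(%i2) charat("Lisp", %);
(%o2) s
@end example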
1575 @opencatbox{Categories:}
1576 @category{Package stringproc}
1581 @c -----------------------------------------------------------------------------
1583 @deffn {Function} sremove @
1584 @fname{sremove} (@var{seq}, @var{string}) @
1585 @fname{sremove} (@var{seq}, @var{string}, @var{test}) @
1586 @fname{sremove} (@var{seq}, @var{string}, @var{test}, @var{start}) @
1587 @fname{sremove} (@var{seq}, @var{string}, @var{test}, @var{start}, @var{end})
1589 Returns a string like @var{string} but without all substrings matching @var{seq}.
1590 Default test function for matching is @code{sequal}.
1591 If @code{sremove} should ignore case while searching for @var{seq}, use @code{sequalignore} as test.
1592 Use @var{start} and @var{end} to limit searching.
1593 Note that the first character in @var{string} is in position 1.
1598 (%i1) sremove("n't","I don't like coffee.");
1599 (%o1) I do like coffee.
1600 (%i2) sremove ("DO ",%,'sequalignore);
1601 (%o2) I like coffee.
1604 @opencatbox{Categories:}
1605 @category{Package stringproc}
1610 @c -----------------------------------------------------------------------------
1611 @anchor{sremovefirst}
1612 @deffn {Function} sremovefirst @
1613 @fname{sremovefirst} (@var{seq}, @var{string}) @
1614 @fname{sremovefirst} (@var{seq}, @var{string}, @var{test}) @
1615 @fname{sremovefirst} (@var{seq}, @var{string}, @var{test}, @var{start}) @
1616 @fname{sremovefirst} (@var{seq}, @var{string}, @var{test}, @var{start}, @var{end})
1618 Like @code{sremove} except that only the first substring that matches @var{seq} is removed.
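Example: a minimal sketch contrasting both functions on an arbitrary sample string.

@example
(%i1) sremove("a", "banana");
(%o1) bnn
(%i2) sremovefirst("a", "banana");
(%o2) bnana
@end example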
1620 @opencatbox{Categories:}
1621 @category{Package stringproc}
1626 @c -----------------------------------------------------------------------------
1628 @deffn {Function} sreverse (@var{string})
1630 Returns a string with all the characters of @var{string} in reverse order.
1632 See also @mrefdot{reverse}
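Example: a minimal sketch with an arbitrary sample string.

@example
(%i1) sreverse("Lisp");
(%o1) psiL
@end example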
1634 @opencatbox{Categories:}
1635 @category{Package stringproc}
1640 @c -----------------------------------------------------------------------------
1642 @deffn {Function} ssearch @
1643 @fname{ssearch} (@var{seq}, @var{string}) @
1644 @fname{ssearch} (@var{seq}, @var{string}, @var{test}) @
1645 @fname{ssearch} (@var{seq}, @var{string}, @var{test}, @var{start}) @
1646 @fname{ssearch} (@var{seq}, @var{string}, @var{test}, @var{start}, @var{end})
1648 Returns the position of the first substring of @var{string} that matches the string @var{seq}.
1649 Default test function for matching is @code{sequal}.
1650 If @code{ssearch} should ignore case, use @code{sequalignore} as test.
1651 Use @var{start} and @var{end} to limit searching.
1652 Note that the first character in @var{string} is in position 1.
1657 (%i1) ssearch("~s","~@{~S ~@}~%",'sequalignore);
1661 @opencatbox{Categories:}
1662 @category{Package stringproc}
1667 @c -----------------------------------------------------------------------------
1669 @deffn {Function} ssort @
1670 @fname{ssort} (@var{string}) @
1671 @fname{ssort} (@var{string}, @var{test})
1673 Returns a string that contains all characters from @var{string} in an order such that there are no two successive characters @var{c} and @var{d} for which @code{test (@var{c}, @var{d})} is @code{false} and @code{test (@var{d}, @var{c})} is @code{true}.
1674 The default test function for sorting is @code{clessp}.
1675 The set of test functions is @code{@{clessp, clesspignore, cgreaterp, cgreaterpignore, cequal, cequalignore@}}.
1680 (%i1) ssort("I don't like Mondays.");
1681 (%o1) '.IMaddeiklnnoosty
1682 (%i2) ssort("I don't like Mondays.",'cgreaterpignore);
1683 (%o2) ytsoonnMlkIiedda.'
1686 @opencatbox{Categories:}
1687 @category{Package stringproc}
1692 @c -----------------------------------------------------------------------------
1694 @deffn {Function} ssubst @
1695 @fname{ssubst} (@var{new}, @var{old}, @var{string}) @
1696 @fname{ssubst} (@var{new}, @var{old}, @var{string}, @var{test}) @
1697 @fname{ssubst} (@var{new}, @var{old}, @var{string}, @var{test}, @var{start}) @
1698 @fname{ssubst} (@var{new}, @var{old}, @var{string}, @var{test}, @var{start}, @var{end})
1700 Returns a string like @var{string} except that all substrings matching @var{old} are replaced by @var{new}.
1701 @var{old} and @var{new} need not be of the same length.
1702 Default test function for matching is @code{sequal}.
1703 If @code{ssubst} should ignore case while searching for @var{old}, use @code{sequalignore} as test.
1704 Use @var{start} and @var{end} to limit searching.
1705 Note that the first character in @var{string} is in position 1.
1710 (%i1) ssubst("like","hate","I hate Thai food. I hate green tea.");
1711 (%o1) I like Thai food. I like green tea.
1712 (%i2) ssubst("Indian","thai",%,'sequalignore,8,12);
1713 (%o2) I like Indian food. I like green tea.
1716 @opencatbox{Categories:}
1717 @category{Package stringproc}
1722 @c -----------------------------------------------------------------------------
1723 @anchor{ssubstfirst}
1724 @deffn {Function} ssubstfirst @
1725 @fname{ssubstfirst} (@var{new}, @var{old}, @var{string}) @
1726 @fname{ssubstfirst} (@var{new}, @var{old}, @var{string}, @var{test}) @
1727 @fname{ssubstfirst} (@var{new}, @var{old}, @var{string}, @var{test}, @var{start}) @
1728 @fname{ssubstfirst} (@var{new}, @var{old}, @var{string}, @var{test}, @var{start}, @var{end})
1730 Like @code{ssubst} except that only the first substring that matches @var{old} is replaced.
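Example: a brief sketch with an arbitrary sample sentence. Only the first occurrence of @var{old} is replaced.

@example
(%i1) ssubstfirst("like", "hate", "I hate Thai food. I hate green tea.");
(%o1) I like Thai food. I hate green tea.
@end example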
1732 @opencatbox{Categories:}
1733 @category{Package stringproc}
1738 @c -----------------------------------------------------------------------------
1740 @deffn {Function} strim (@var{seq},@var{string})
1742 Returns a string like @var{string},
1743 but with all characters that appear in @var{seq} removed from both ends.
1748 (%i1) "/* comment */"$
1749 (%i2) strim(" /*",%);
1755 @opencatbox{Categories:}
1756 @category{Package stringproc}
1761 @c -----------------------------------------------------------------------------
1763 @deffn {Function} striml (@var{seq}, @var{string})
1765 Like @code{strim} except that only the left end of @var{string} is trimmed.
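Example: a minimal sketch with an arbitrary sample string. The characters in @var{seq} are removed from the left end only.

@example
(%i1) striml(" /*", "/* comment */");
(%o1) comment */
@end example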
1767 @opencatbox{Categories:}
1768 @category{Package stringproc}
1773 @c -----------------------------------------------------------------------------
1775 @deffn {Function} strimr (@var{seq}, @var{string})
1777 Like @code{strim} except that only the right end of @var{string} is trimmed.
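Example: a minimal sketch with an arbitrary sample string. The characters in @var{seq} are removed from the right end only.

@example
(%i1) strimr(" /*", "/* comment */");
(%o1) /* comment
@end example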
1779 @opencatbox{Categories:}
1780 @category{Package stringproc}
1785 @c -----------------------------------------------------------------------------
1787 @deffn {Function} stringp (@var{obj})
1789 Returns @code{true} if @var{obj} is a string.
1790 See the introduction for an example.
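Example: a minimal sketch with arbitrary sample arguments.

@example
(%i1) stringp("Text");
(%o1) true
(%i2) stringp(42);
(%o2) false
@end example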
1792 @opencatbox{Categories:}
1793 @category{Predicate functions}
1794 @category{Package stringproc}
1799 @c -----------------------------------------------------------------------------
1801 @deffn {Function} substring @
1802 @fname{substring} (@var{string}, @var{start}) @
1803 @fname{substring} (@var{string}, @var{start}, @var{end})
1805 Returns the substring of @var{string} beginning at position @var{start} and ending at position @var{end}.
1806 The character at position @var{end} is not included.
1807 If @var{end} is not given, the substring contains the rest of the string.
1808 Note that the first character in @var{string} is in position 1.
1813 (%i1) substring("substring",4);
1815 (%i2) substring(%,4,6);
1819 @opencatbox{Categories:}
1820 @category{Package stringproc}
1825 @c -----------------------------------------------------------------------------
1827 @deffn {Function} supcase @
1828 @fname{supcase} (@var{string}) @
1829 @fname{supcase} (@var{string}, @var{start}) @
1830 @fname{supcase} (@var{string}, @var{start}, @var{end})
1832 Returns @var{string} except that lowercase characters from position @var{start} to @var{end} are replaced by the corresponding uppercase ones.
1833 If @var{end} is not given,
1834 all lowercase characters from @var{start} to the end of @var{string} are replaced.
1839 (%i1) supcase("english",1,2);
1843 @opencatbox{Categories:}
1844 @category{Package stringproc}
1849 @c -----------------------------------------------------------------------------
1851 @deffn {Function} tokens @
1852 @fname{tokens} (@var{string}) @
1853 @fname{tokens} (@var{string}, @var{test})
1855 Returns a list of tokens, which have been extracted from @var{string}.
1856 The tokens are substrings whose characters satisfy a certain test function.
1857 If @var{test} is not given, @code{constituent} is used as the default test.
1858 The set of test functions is @code{@{constituent, alphacharp, digitcharp, lowercasep, uppercasep, charp, characterp, alphanumericp@}}.
1859 (The Lisp version of @code{tokens} was written by Paul Graham. ANSI Common Lisp, 1996, page 67.)
1864 (%i1) tokens("24 October 2005");
1865 (%o1) [24, October, 2005]
1866 (%i2) tokens("05-10-24",'digitcharp);
1868 (%i3) map(parse_string,%);
1872 @opencatbox{Categories:}
1873 @category{Package stringproc}
1878 @c -----------------------------------------------------------------------------
1879 @c -----------------------------------------------------------------------------
1880 @node Octets and Utilities for Cryptography, Regular Expressions, String Processing
1881 @section Octets and Utilities for Cryptography
1883 @c -----------------------------------------------------------------------------
1885 @deffn {Function} base64 (@var{arg})
1887 Returns the base64-representation of @var{arg} as a string.
1888 The argument @var{arg} may be a string, a non-negative integer or a list of octets.
1893 (%i1) base64: base64("foo bar baz");
1894 (%o1) Zm9vIGJhciBiYXo=
1895 (%i2) string: base64_decode(base64);
1898 (%i4) integer: base64_decode(base64, 'number);
1899 (%o4) 666f6f206261722062617a
1900 (%i5) octets: base64_decode(base64, 'list);
1901 (%o5) [66, 6F, 6F, 20, 62, 61, 72, 20, 62, 61, 7A]
1903 (%i7) base64(octets);
1904 (%o7) Zm9vIGJhciBiYXo=
1907 Note that if @var{arg} contains umlauts (that is, octets larger than 127)
1908 the resulting base64 string is platform dependent.
1909 The decoded string, however, will be equal to the original.
1911 @opencatbox{Categories:}
1912 @category{Package stringproc}
1917 @c -----------------------------------------------------------------------------
1918 @anchor{base64_decode}
1919 @deffn {Function} base64_decode @
1920 @fname{base64_decode} (@var{base64-string}) @
1921 @fname{base64_decode} (@var{base64-string}, @var{return-type})
1923 By default @code{base64_decode} decodes the @var{base64-string} back to the original string.
1925 The optional argument @var{return-type} allows @code{base64_decode} to
1926 alternatively return the corresponding number or list of octets.
1927 @var{return-type} may be @code{string}, @code{number} or @code{list}.
1929 Example: See @ref{base64}.
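Example: a brief sketch using decimal output (the default @code{obase}). The base64 string is the encoding of the US-ASCII string @code{"foo bar baz"}.

@example
(%i1) base64_decode("Zm9vIGJhciBiYXo=");
(%o1) foo bar baz
(%i2) base64_decode("Zm9vIGJhciBiYXo=", 'list);
(%o2) [102, 111, 111, 32, 98, 97, 114, 32, 98, 97, 122]
@end example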
1931 @opencatbox{Categories:}
1932 @category{Package stringproc}
1937 @c -----------------------------------------------------------------------------
1939 @deffn {Function} crc24sum @
1940 @fname{crc24sum} (@var{octets}) @
1941 @fname{crc24sum} (@var{octets}, @var{return-type})
1943 By default @code{crc24sum} returns the @code{CRC24} checksum of an octet-list
1946 The optional argument @var{return-type} allows @code{crc24sum} to
1947 alternatively return the corresponding number or list of octets.
1948 @var{return-type} may be @code{string}, @code{number} or @code{list}.
1953 -----BEGIN PGP SIGNATURE-----
1954 Version: GnuPG v2.0.22 (GNU/Linux)
1956 iQEcBAEBAgAGBQJVdCTzAAoJEG/1Mgf2DWAqCSYH/AhVFwhu1D89C3/QFcgVvZTM
1957 wnOYzBUURJAL/cT+IngkLEpp3hEbREcugWp+Tm6aw3R4CdJ7G3FLxExBH/5KnDHi
1958 rBQu+I7+3ySK2hpryQ6Wx5J9uZSa4YmfsNteR8up0zGkaulJeWkS4pjiRM+auWVe
1959 vajlKZCIK52P080DG7Q2dpshh4fgTeNwqCuCiBhQ73t8g1IaLdhDN6EzJVjGIzam
1960 /spqT/sTo6sw8yDOJjvU+Qvn6/mSMjC/YxjhRMaQt9EMrR1AZ4ukBF5uG1S7mXOH
1961 WdiwkSPZ3gnIBhM9SuC076gLWZUNs6NqTeE3UzMjDAFhH3jYk1T7mysCvdtIkms=
1963 -----END PGP SIGNATURE-----
1967 (%i1) ibase : obase : 16.$
1968 (%i2) sig64 : sconcat(
1969 "iQEcBAEBAgAGBQJVdCTzAAoJEG/1Mgf2DWAqCSYH/AhVFwhu1D89C3/QFcgVvZTM",
1970 "wnOYzBUURJAL/cT+IngkLEpp3hEbREcugWp+Tm6aw3R4CdJ7G3FLxExBH/5KnDHi",
1971 "rBQu+I7+3ySK2hpryQ6Wx5J9uZSa4YmfsNteR8up0zGkaulJeWkS4pjiRM+auWVe",
1972 "vajlKZCIK52P080DG7Q2dpshh4fgTeNwqCuCiBhQ73t8g1IaLdhDN6EzJVjGIzam",
1973 "/spqT/sTo6sw8yDOJjvU+Qvn6/mSMjC/YxjhRMaQt9EMrR1AZ4ukBF5uG1S7mXOH",
1974 "WdiwkSPZ3gnIBhM9SuC076gLWZUNs6NqTeE3UzMjDAFhH3jYk1T7mysCvdtIkms=" )$
1975 (%i3) octets: base64_decode(sig64, 'list)$
1976 (%i4) crc24: crc24sum(octets, 'list);
1978 (%i5) base64(crc24);
1982 @opencatbox{Categories:}
1983 @category{Package stringproc}
1988 @c -----------------------------------------------------------------------------
1990 @deffn {Function} md5sum @
1991 @fname{md5sum} (@var{arg}) @
1992 @fname{md5sum} (@var{arg}, @var{return-type})
1994 Returns the @code{MD5} checksum of a string, non-negative integer,
1995 list of octets, or binary (not character) input stream.
1996 A file for which an input stream is opened may be an ordinary text file;
1997 it is the stream which needs to be binary, not the file itself.
1999 When the argument is an input stream,
2000 @code{md5sum} reads the entire content of the stream,
2001 but does not close the stream.
2003 The default return value is a string containing 32 hex characters.
2004 The optional argument @var{return-type} allows @code{md5sum} to alternatively
2005 return the corresponding number or list of octets.
2006 @var{return-type} may be @code{string}, @code{number} or @code{list}.
2008 Note that if @var{arg} contains German umlauts or other non-ASCII
2009 characters (that is, octets larger than 127), the @code{MD5} checksum is platform dependent.
2014 (%i1) ibase: obase: 16.$
2015 (%i2) msg: "foo bar baz"$
2016 (%i3) string: md5sum(msg);
2017 (%o3) ab07acbb1e496801937adfa772424bf7
2018 (%i4) integer: md5sum(msg, 'number);
2019 (%o4) 0ab07acbb1e496801937adfa772424bf7
2020 (%i5) octets: md5sum(msg, 'list);
2021 (%o5) [0AB,7,0AC,0BB,1E,49,68,1,93,7A,0DF,0A7,72,42,4B,0F7]
2022 (%i6) sdowncase( printf(false, "~@{~2,'0x~^:~@}", octets) );
2023 (%o6) ab:07:ac:bb:1e:49:68:01:93:7a:df:a7:72:42:4b:f7
2026 The argument may be a binary input stream.
2029 (%i1) S: openr_binary (file_search ("md5.lisp"));
2030 (%o1) #<INPUT BUFFERED FILE-STREAM (UNSIGNED-BYTE 8)
2031 /home/robert/maxima/maxima-code/share/stringproc/md5.lisp>
2033 (%o2) 31a512ed53daf5b99495c9d05559355f
2038 @opencatbox{Categories:}
2039 @category{Package stringproc}
2044 @c -----------------------------------------------------------------------------
2046 @deffn {Function} mgf1_sha1 @
2047 @fname{mgf1_sha1} (@var{seed}, @var{len}) @
2048 @fname{mgf1_sha1} (@var{seed}, @var{len}, @var{return-type})
2050 Returns a pseudo random number of variable length.
2051 By default the returned value is a number with a length of @var{len} octets.
2053 The optional argument @var{return-type} allows @code{mgf1_sha1} to alternatively
2054 return the corresponding list of @var{len} octets.
2055 @var{return-type} may be @code{number} or @code{list}.
2057 The computation of the returned value is described in @code{RFC 3447},
2058 appendix @code{B.2.1 MGF1}.
2059 @code{SHA1} is used as the hash function, i.e. the randomness of the computed number
2060 relies on the randomness of @code{SHA1} hashes.
2065 (%i1) ibase: obase: 16.$
2066 (%i2) number: mgf1_sha1(4711., 8);
2067 (%o2) 0e0252e5a2a42fea1
2068 (%i3) octets: mgf1_sha1(4711., 8, 'list);
2069 (%o3) [0E0,25,2E,5A,2A,42,0FE,0A1]
2072 @opencatbox{Categories:}
2073 @category{Package stringproc}
2078 @c -----------------------------------------------------------------------------
2079 @anchor{number_to_octets}
2080 @deffn {Function} number_to_octets (@var{number})
2082 Returns an octet-representation of @var{number} as a list of octets.
2083 The @var{number} must be a non-negative integer.
2088 (%i1) ibase : obase : 16.$
2089 (%i2) octets: [0ca,0fe,0ba,0be]$
2090 (%i3) number: octets_to_number(octets);
2092 (%i4) number_to_octets(number);
2093 (%o4) [0CA, 0FE, 0BA, 0BE]
2096 @opencatbox{Categories:}
2097 @category{Package stringproc}
2102 @c -----------------------------------------------------------------------------
2103 @anchor{octets_to_number}
2104 @deffn {Function} octets_to_number (@var{octets})
2106 Returns a number by concatenating the octets in the list of @var{octets}.
2108 Example: See @ref{number_to_octets}.
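Example: a minimal sketch using decimal input and output (the default @code{ibase} and @code{obase}). The first octet in the list is the most significant one.

@example
(%i1) octets_to_number([1, 0]);
(%o1) 256
(%i2) number_to_octets(256);
(%o2) [1, 0]
@end example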
2110 @opencatbox{Categories:}
2111 @category{Package stringproc}
2116 @c -----------------------------------------------------------------------------
2117 @anchor{octets_to_oid}
2118 @deffn {Function} octets_to_oid (@var{octets})
2120 Computes an object identifier (OID) from the list of @var{octets}.
2122 Example: RSA encryption OID
2125 (%i1) ibase : obase : 16.$
2126 (%i2) oid: octets_to_oid([2A,86,48,86,0F7,0D,1,1,1]);
2127 (%o2) 1.2.840.113549.1.1.1
2128 (%i3) oid_to_octets(oid);
2129 (%o3) [2A, 86, 48, 86, 0F7, 0D, 1, 1, 1]
2132 @opencatbox{Categories:}
2133 @category{Package stringproc}
2138 @c -----------------------------------------------------------------------------
2139 @anchor{octets_to_string}
2140 @deffn {Function} octets_to_string @
2141 @fname{octets_to_string} (@var{octets}) @
2142 @fname{octets_to_string} (@var{octets}, @var{encoding})
2144 Decodes the list of @var{octets} into a string according to current system defaults.
2145 When decoding octets corresponding to non-US-ASCII characters
2146 the result depends on the platform, application and underlying Lisp.
2148 Example: Using system defaults
2149 (Maxima compiled with GCL, which uses no format definition and
2150 simply passes through the UTF-8-octets encoded by the GNU/Linux terminal).
2153 (%i1) octets: string_to_octets("abc");
2155 (%i2) octets_to_string(octets);
2157 (%i3) ibase: obase: 16.$
2158 (%i4) unicode(20AC);
2160 (%i5) octets: string_to_octets(%);
2161 (%o5) [0E2, 82, 0AC]
2162 (%i6) octets_to_string(octets);
2164 (%i7) utf8_to_unicode(octets);
2168 If the external format of the Lisp reader is UTF-8, the optional
2169 argument @var{encoding} selects the encoding used for the octet-to-string conversion.
2170 See @ref{adjust_external_format} for changing the external format, if necessary.
2172 Some names of supported encodings (see corresponding Lisp manual for more): @*
2173 CCL, CLISP, SBCL: @code{utf-8, ucs-2be, ucs-4be, iso-8859-1, cp1252, cp850} @*
2174 CMUCL: @code{utf-8, utf-16-be, utf-32-be, iso8859-1, cp1252} @*
2175 ECL: @code{utf-8, ucs-2be, ucs-4be, iso-8859-1, windows-cp1252, dos-cp850}
2177 Example (continued): Using the optional encoding argument
2178 (Maxima compiled with SBCL, GNU/Linux terminal).
2181 (%i8) string_to_octets("@euro{}", "ucs-2be");
2185 @opencatbox{Categories:}
2186 @category{Package stringproc}
2191 @c -----------------------------------------------------------------------------
2192 @anchor{oid_to_octets}
2193 @deffn {Function} oid_to_octets (@var{oid-string})
2195 Converts an object identifier (OID), given as the string @var{oid-string}, to a list of octets.
2197 Example: See @ref{octets_to_oid}.
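Example: a brief sketch using decimal output (the default @code{obase}). The OID is the RSA encryption OID from the example in @code{octets_to_oid}.

@example
(%i1) oid_to_octets("1.2.840.113549.1.1.1");
(%o1) [42, 134, 72, 134, 247, 13, 1, 1, 1]
@end example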
2199 @opencatbox{Categories:}
2200 @category{Package stringproc}
2205 @c -----------------------------------------------------------------------------
2207 @deffn {Function} sha1sum @
2208 @fname{sha1sum} (@var{arg}) @
2209 @fname{sha1sum} (@var{arg}, @var{return-type})
2211 Returns the @code{SHA1} fingerprint of a string, a non-negative integer or
2212 a list of octets. The default return value is a string containing 40 hex characters.
2214 The optional argument @var{return-type} allows @code{sha1sum} to alternatively
2215 return the corresponding number or list of octets.
2216 @var{return-type} may be @code{string}, @code{number} or @code{list}.
2221 (%i1) ibase: obase: 16.$
2222 (%i2) msg: "foo bar baz"$
2223 (%i3) string: sha1sum(msg);
2224 (%o3) c7567e8b39e2428e38bf9c9226ac68de4c67dc39
2225 (%i4) integer: sha1sum(msg, 'number);
2226 (%o4) 0c7567e8b39e2428e38bf9c9226ac68de4c67dc39
2227 (%i5) octets: sha1sum(msg, 'list);
2228 (%o5) [0C7,56,7E,8B,39,0E2,42,8E,38,0BF,9C,92,26,0AC,68,0DE,4C,67,0DC,39]
2229 (%i6) sdowncase( printf(false, "~@{~2,'0x~^:~@}", octets) );
2230 (%o6) c7:56:7e:8b:39:e2:42:8e:38:bf:9c:92:26:ac:68:de:4c:67:dc:39
Note that if @var{arg} contains German umlauts or other non-ASCII
characters (that is, octets larger than 127), the @code{SHA1} fingerprint is platform dependent.
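
One way to sidestep this platform dependence, sketched here under the
assumption that the external format is UTF-8 so that the optional encoding
argument of @code{string_to_octets} is honored, is to convert the string to
octets explicitly and hash the octet list. The sample character is only an
illustration, and no output is shown because it was not checked on a
particular platform.

@example
(%i1) octets: string_to_octets("@euro{}", "utf-8")$
(%i2) sha1sum(octets);
@end example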
2236 @opencatbox{Categories:}
2237 @category{Package stringproc}
2242 @c -----------------------------------------------------------------------------
2244 @deffn {Function} sha256sum @
2245 @fname{sha256sum} (@var{arg}) @
2246 @fname{sha256sum} (@var{arg}, @var{return-type})
2248 Returns the @code{SHA256} fingerprint of a string, a non-negative integer or
2249 a list of octets. The default return value is a string containing 64 hex characters.
2251 The optional argument @var{return-type} allows @code{sha256sum} to alternatively
2252 return the corresponding number or list of octets (see @ref{sha1sum}).
2257 (%i1) string: sha256sum("foo bar baz");
2258 (%o1) dbd318c1c462aee872f41109a4dfd3048871a03dedd0fe0e757ced57dad6f2d7
Note that if @var{arg} contains German umlauts or other non-ASCII
characters (that is, octets larger than 127), the @code{SHA256} fingerprint is platform dependent.
2264 @opencatbox{Categories:}
2265 @category{Package stringproc}
2270 @c -----------------------------------------------------------------------------
2271 @anchor{string_to_octets}
2272 @deffn {Function} string_to_octets @
2273 @fname{string_to_octets} (@var{string}) @
2274 @fname{string_to_octets} (@var{string}, @var{encoding})
Encodes a @var{string} into a list of octets according to the current system defaults.
When encoding strings that contain non-US-ASCII characters,
the result depends on the platform, the application and the underlying Lisp.
If the external format of the Lisp reader is UTF-8, the optional argument
@var{encoding} sets the encoding used for the string-to-octet conversion.
To change the external format, see @ref{adjust_external_format}.
2284 See @ref{octets_to_string} for examples and some more information.
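
A minimal round-trip sketch, assuming the same default encoding applies in
both directions: converting a plain ASCII string to octets and back should
give the original string. The sample text is arbitrary.

@example
(%i1) octets: string_to_octets("abc");
(%o1) [97, 98, 99]
(%i2) is(octets_to_string(octets) = "abc");
(%o2) true
@end example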
2286 @opencatbox{Categories:}
2287 @category{Package stringproc}
2292 @node Regular Expressions, , Octets and Utilities for Cryptography
2293 @section Regular Expressions
2295 @node Introduction to Regular Expressions, Functions and Variables, Regular Expressions, Regular Expressions
2296 @subsection Introduction to Regular Expressions
@code{sregex} is an interface to the portable regex engine by Dorai
Sitaram. The syntax of the regular expressions is described in detail
in the @url{http://ds26gte.github.io/pregexp/index.html, pregexp
manual}.
2303 While @code{sregex} supports Unicode, the support for Unicode characters in
2304 strings is dependent on the support for Unicode characters in the Lisp
2307 @node Functions and Variables, , Introduction to Regular Expressions, Regular Expressions
2308 @subsection Functions and Variables
2310 @anchor{regex_compile}
2311 @deffn {Function} regex_compile (@var{pattern})
Compiles the regex string @var{pattern} into an internal form that is
easier for the regex engine to process. Compiling is optional:
all the regex functions accept either a compiled regex or a
string. If a pattern is used many times, compiling it
speeds up matching.
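
As a sketch of the reuse case, a pattern can be compiled once and then applied
to several strings. The pattern and the strings below are made up for
illustration. Per the description of @code{regex_match}, each element of the
result is either a list of matched strings or @code{false}.

@example
(%i1) re : regex_compile("[0-9]+")$
(%i2) map(lambda([s], regex_match(re, s)), ["abc 12", "no digits", "7 up"]);
@end example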
2319 @c regex_compile("c.r");
2323 (%i1) regex_compile("c.r");
2324 (%o1) Structure [COMPILED-REGEX for "c.r"]
2330 @anchor{regex_match_pos}
2331 @deffn {Function} regex_match_pos (@var{regex}, @var{str})
2332 @deffnx {Function} regex_match_pos (@var{regex}, @var{str}, @var{start})
2333 @deffnx {Function} regex_match_pos (@var{regex}, @var{str}, @var{start}, @var{end})
Returns a list of lists, each holding a start and an end position in
@var{str}. The first element gives the positions of the first match of
@var{regex}. If no match is found, returns @code{false}.
If a third argument, @var{start}, is supplied, it is the index in
@var{str} at which matching starts. The fourth argument, @var{end}, is the
index in @var{str} at which matching ends.
2343 @c str : "his hay needle stack -- my hay needle stack -- her hay needle stack"$
2344 @c regex : regex_compile("ne{2}dle")$
2345 @c regex_match_pos(regex, str);
2346 @c regex_match_pos("ne{2}dle", str);
2347 @c regex_match_pos("ne{2}dle", str, 25, 44);
2350 (%i1) str : "his hay needle stack -- my hay needle stack -- her hay needle stack"$
2351 (%i2) regex : regex_compile("ne@{2@}dle")$
2353 (%i3) regex_match_pos(regex, str);
2357 (%i4) regex_match_pos("ne@{2@}dle", str);
2361 (%i5) regex_match_pos("ne@{2@}dle", str, 25, 44);
2366 Here is an example where @code{regex_match_pos} returns a list of more
2369 @c str : "jan 1, 1970";
2370 @c match: regex_match_pos("([a-z]+) ([0-9]+), ([0-9]+)", "jan 1, 1970");
2371 @c map(lambda([posn], substring(str, posn[1], posn[2])), match);
2375 (%i1) str : "jan 1, 1970";
2379 (%i2) match: regex_match_pos("([a-z]+) ([0-9]+), ([0-9]+)", "jan 1, 1970");
2380 (%o2) [[1, 12], [1, 4], [5, 6], [8, 12]]
2383 (%i3) map(lambda([posn], substring(str, posn[1], posn[2])), match);
2384 (%o3) [jan 1, 1970, jan, 1, 1970]
The first element corresponds to the full match. Each subsequent element of
the list is the substring that matches a @emph{cluster} enclosed in
parentheses in the given regular expression.
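
Because @code{regex_match_pos} returns @code{false} when nothing matches, the
result is best checked before it is indexed. The following sketch wraps that
check in a helper. The name @code{first_match} is hypothetical and not part of
the package. The helper is meant to return the text of the full match, or
@code{false} if there is none.

@example
(%i1) first_match(re, s) := block([m : regex_match_pos(re, s)],
        if m = false then false
        else substring(s, m[1][1], m[1][2]))$
(%i2) first_match("[0-9]+", "jan 1, 1970");
(%i3) first_match("[0-9]+", "no digits here");
@end example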
2392 @opencatbox{Categories:}
2393 @category{Package stringproc}
2398 @anchor{regex_match}
2399 @deffn {Function} regex_match (@var{regex}, @var{str})
2400 @deffnx {Function} regex_match (@var{regex}, @var{str}, @var{start})
2401 @deffnx {Function} regex_match (@var{regex}, @var{str}, @var{start}, @var{end})
2402 @code{regex_match} is very similar to @code{regex_match_pos} except
2403 that it returns the matching substrings instead of the indices of the
2404 match. If no match is found, returns @code{false}.
2407 @c regex_match("ne{2}dle", "hay needle stack");
2408 @c regex_match("ne{2}dle", "hay needle stack", 10);
2412 (%i1) regex_match("ne@{2@}dle", "hay needle stack");
2416 (%i2) regex_match("ne@{2@}dle", "hay needle stack", 10);
Here are examples using POSIX character classes. @code{[:alpha:]}
2422 matches any letter. The pattern matches any letter or underscore:
2424 @c regex_match("[[:alpha:]_]", "--x--");
2425 @c regex_match("[[:alpha:]_]", "--_--");
2426 @c regex_match("[[:alpha:]_]", "--:--");
2430 (%i1) regex_match("[[:alpha:]_]", "--x--");
2434 (%i2) regex_match("[[:alpha:]_]", "--_--");
2438 (%i3) regex_match("[[:alpha:]_]", "--:--");
2443 @code{sregex} supports @emph{clusters} (see
2444 @url{https://ds26gte.github.io/pregexp/index.html#TAG:__tex2page_toc_TAG:__tex2page_sec_3.4,
2445 pregexp clusters}) which are subpatterns denoted
2446 by being enclosed within parentheses. These cause the matcher to
2447 return the submatch along with the overall match.
Here we are looking for one or more letters followed by a space, one or
more digits, a comma and a space, then one or more digits.
2452 @c regex_match("([a-z]+) ([0-9]+), ([0-9]+)", "jan 1, 1970");
2456 (%i1) regex_match("([a-z]+) ([0-9]+), ([0-9]+)", "jan 1, 1970");
2457 (%o1) [jan 1, 1970, jan, 1, 1970]
2460 The result is a list of strings. The first element is the full match.
2461 The second matches @code{"([a-z]+)"}, which is a cluster of any number
2462 of letters. Hence, @code{"jan"} matches this cluster. Likewise for
2465 A more complicated example illustrates how a subpattern fails to
2466 match, but the overall pattern matches. In this case, @code{false}
represents the failed match.
2469 The regex pattern matches ``month year'' or ``month day, year''. The
2470 subpattern matches the day, if present.
2472 @c date_re : regex_compile("([a-z]+) +([0-9]+,)? *([0-9]+)");
2473 @c regex_match(date_re, "jan 1, 1970");
2474 @c regex_match(date_re, "jan 1970");
2478 (%i1) date_re : regex_compile("([a-z]+) +([0-9]+,)? *([0-9]+)");
2480 Structure [COMPILED-REGEX for "([a-z]+) +([0-9]+,)? *([0-9]+)"]
2483 (%i2) regex_match(date_re, "jan 1, 1970");
2484 (%o2) [jan 1, 1970, jan, 1,, 1970]
2487 (%i3) regex_match(date_re, "jan 1970");
2488 (%o3) [jan 1970, jan, false, 1970]
You can also do case-insensitive matches by using a @emph{cloister}
2494 @url{https://ds26gte.github.io/pregexp/index.html#TAG:__tex2page_toc_TAG:__tex2page_sec_3.4.3,
2496 with the @code{i} modifier:
2499 @c regex_match("hearth", "HeartH");
2500 @c regex_match("(?i:hearth)", "HeartH");
2504 (%i1) regex_match("hearth", "HeartH");
2508 (%i2) regex_match("(?i:hearth)", "HeartH");
2513 Alternate subpatterns can be separated by @code{|}.
2515 @c regex_match("f(ee|i|o|um)", "a small, final fee");
2519 (%i1) regex_match("f(ee|i|o|um)", "a small, final fee");
2523 The first element is the full match @code{"fi"}; the second shows
2524 that we matched @code{"i"} for the cluster.
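
The match is @code{"fi"} rather than @code{"fee"} because the search stops at
the leftmost position where any alternative matches, which here is inside
@code{"final"}. As a small illustrative variation (the input string is made
up), removing that word leaves @code{"fee"} as the only candidate:

@example
(%i1) regex_match("f(ee|i|o|um)", "a small fee");
(%o1) [fee, ee]
@end example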
2527 @opencatbox{Categories:}
2528 @category{Package stringproc}
2533 @anchor{regex_split}
2534 @deffn {Function} regex_split (@var{regex}, @var{str})
Splits @var{str} into substrings and returns them as a list of
strings. The @var{regex} identifies the delimiters that
separate the substrings.
2540 @c regex_split("[,;]+", "split,pea;;;soup");
2544 (%i1) regex_split("[,;]+", "split,pea;;;soup");
2545 (%o1) [split, pea, soup]
2549 @opencatbox{Categories:}
2550 @category{Package stringproc}
2555 @anchor{regex_subst_first}
2556 @deffn {Function} regex_subst_first (@var{replacement}, @var{pattern}, @var{str})
Returns a string where the first occurrence of @var{pattern} in
@var{str} has been replaced with @var{replacement}.
2561 @c regex_subst_first("ty", "t.", "liberte egalite fraternite");
2565 (%i1) regex_subst_first("ty", "t.", "liberte egalite fraternite");
2566 (%o1) liberty egalite fraternite
This example shows how to use back references. The replacement
specifies that the first submatch is used as the replacement text.
2573 @c regex_match("_(.+?)_", "the _nina_, the _pinta_, and the _santa maria_");
2574 @c regex_subst_first("*\\1*", "_(.+?)_", "the _nina_, the _pinta_, and the _santa maria_");
2578 (%i1) regex_match("_(.+?)_", "the _nina_, the _pinta_, and the _santa maria_");
2579 (%o1) [_nina_, nina]
2582 (%i2) regex_subst_first("*\\1*", "_(.+?)_", "the _nina_, the _pinta_, and the _santa maria_");
2583 (%o2) the *nina*, the _pinta_, and the _santa maria_
2587 @opencatbox{Categories:}
2588 @category{Package stringproc}
2593 @anchor{regex_subst}
2594 @deffn {Function} regex_subst (@var{replacement}, @var{pattern}, @var{str})
2595 Returns a string where every occurrence of @var{pattern} has been
2596 replaced by @var{replacement} in the string @var{str}.
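
Back references, as shown for @code{regex_subst_first}, can also appear in the
replacement here. The sketch below reuses the string from that example and
assumes the same escaping of the backslash, so that every emphasized name is
rewritten rather than only the first:

@example
(%i1) regex_subst("*\\1*", "_(.+?)_", "the _nina_, the _pinta_, and the _santa maria_");
(%o1) the *nina*, the *pinta*, and the *santa maria*
@end example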
2599 @c regex_subst("ty", "t.\\b", "liberte egalite fraternite");
2603 (%i1) regex_subst("ty", "t.\\b", "liberte egalite fraternite");
2604 (%o1) liberty egality fraternity
2608 @opencatbox{Categories:}
2609 @category{Package stringproc}
2614 @anchor{string_to_regex}
2615 @deffn {Function} string_to_regex (@var{str})
Returns a regex string where any special regex characters in @var{str}
are quoted so that they match literally.
2620 @c re : string_to_regex(". :");
2621 @c regex_match(re, "z :");
2622 @c regex_match(re, ". :");
2623 @c regex_match(". :", "z :");
2627 (%i1) re : string_to_regex(". :");
2631 (%i2) regex_match(re, "z :");
2635 (%i3) regex_match(re, ". :");
2639 (%i4) regex_match(". :", "z :");
2644 In this example, the regex will only match a substring consisting of a
2645 period, followed by a space and a colon. Without the quoting, the
2646 @code{"."} would match any single character.
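
A typical use is to build a pattern from text that should be taken literally.
In this sketch (the input strings are illustrative), the period is quoted so
that @code{regex_split} treats it as a plain delimiter instead of a wildcard:

@example
(%i1) regex_split(string_to_regex("."), "a.b.c");
@end example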
2648 @opencatbox{Categories:}
2649 @category{Package stringproc}