<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.2//EN">
<title>HTMLArea Spell Checker</title>
<h1>HTMLArea Spell Checker</h1>
<p>The HTMLArea Spell Checker subsystem consists of the following files:</p>
<ul>
<li>spell-checker.js: the spell checker plugin interface for HTMLArea</li>
<li>spell-check-ui.html: the HTML code for the user interface</li>
<li>spell-check-ui.js: functionality of the user interface</li>
<li>spell-check-logic.cgi: Perl CGI script that checks a text
given through POST for spelling errors</li>
<li>spell-check-style.css: style for misspelled words</li>
<li>lang/en.js: main language file (English).</li>
</ul>
<h2>Process overview</h2>
<p>When an end-user clicks the "spell-check" button in the HTMLArea
editor, a new window is opened with the URL of "spell-check-ui.html".
This window initializes itself with the text found in the editor (using the
<tt>window.opener.SpellChecker.editor</tt> global variable) and
submits the text to the server-side script "spell-check-logic.cgi".
The target of the FORM is an inline frame, which is used both to
display the text and to correct it.</p>
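<p>The hand-off described above can be sketched roughly as follows. This is
only an illustration, not the plugin's actual code: the function name
<tt>buildSpellCheckRequest</tt>, the form field name <tt>content</tt>, the
frame name <tt>framecontent</tt>, and the editor accessor <tt>getHTML()</tt>
are assumptions. Only <tt>window.opener.SpellChecker.editor</tt> and the
script name come from the description above.</p>

```javascript
// Sketch only: describes the POST that the UI window would send.
// Everything except `SpellChecker.editor` and the CGI name is hypothetical.
function buildSpellCheckRequest(openerWindow) {
  // The UI window reaches the editor through the window that opened it.
  const editor = openerWindow.SpellChecker.editor;
  return {
    action: "spell-check-logic.cgi", // server-side script named in the text
    method: "POST",
    target: "framecontent",          // hypothetical name of the inline frame
    fields: {
      content: editor.getHTML(),     // assumed accessor for the editor's HTML
    },
  };
}

// Usage with a stand-in for window.opener (in a browser, pass window.opener).
const fakeOpener = {
  SpellChecker: { editor: { getHTML: () => "<p>Some txt</p>" } },
};
const request = buildSpellCheckRequest(fakeOpener);
console.log(request.method, request.action, request.fields.content);
```

<p>Returning a plain description of the request, rather than touching the
FORM directly, keeps the sketch runnable outside a browser.</p>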
<p>Further, spell-check-logic.cgi calls Aspell for each portion of plain
text found in the given HTML. It rebuilds an HTML file that clearly
marks which words are incorrect, along with suggestions for
each of them. This file is then loaded in the inline frame. Upon
loading, a JavaScript function from "spell-check-ui.js" is called.
This function retrieves all misspelled words from the HTML of the
iframe and sets up the user interface so that it allows correction.</p>
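<p>A minimal sketch of the "retrieve all misspelled words" step, assuming the
server marks each word with a pair of adjacent <tt>span</tt> elements. The
class names <tt>HA-spellcheck-error</tt> and
<tt>HA-spellcheck-suggestions</tt>, the comma-separated suggestion list, and
the function name are assumptions made for this example. The sketch scans an
HTML string with a regular expression so that it runs without a browser; code
running inside the iframe would more likely walk the DOM.</p>

```javascript
// Assumed markup (hypothetical), produced by the server for each bad word:
//   <span class="HA-spellcheck-error">word</span>
//   <span class="HA-spellcheck-suggestions">first,second</span>
// with the two spans directly adjacent to each other.
function collectMisspelledWords(html) {
  const pattern = new RegExp(
    '<span class="HA-spellcheck-error">([^<]*)</span>' +
    '<span class="HA-spellcheck-suggestions">([^<]*)</span>',
    "g"
  );
  const results = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    // An empty suggestions span means Aspell offered nothing.
    const suggestions = match[2] === "" ? [] : match[2].split(",");
    results.push({ word: match[1], suggestions: suggestions });
  }
  return results;
}

// Usage: one marked word with two suggestions.
const sampleHtml =
  'Some <span class="HA-spellcheck-error">txt</span>' +
  '<span class="HA-spellcheck-suggestions">text,tat</span> here.';
console.log(collectMisspelledWords(sampleHtml));
```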
<h2>The server-side script (spell-check-logic.cgi)</h2>
<p><strong>Unicode safety</strong>: the program <em>is</em>
Unicode safe. HTML entities are expanded into their corresponding
Unicode characters. These characters will be matched as part of the
word passed to Aspell. All texts passed to Aspell are in Unicode
(when appropriate). However, Aspell does not seem to support Unicode
(see the <a href="http://mail.gnu.org/archive/html/aspell-user/2000-11/msg00007.html">thread concerning Aspell and Unicode</a>).
This means that words containing Unicode
characters outside the range 0..255 are likely to be reported as
"misspelled" by Aspell.</p>
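<p>To make the 0..255 limit concrete, here is a small client-side sketch. It
is not part of the Perl script: both function names are invented for this
example, only numeric entities are expanded (named entities such as
<tt>&amp;eacute;</tt> are left alone), and the check merely predicts which
words fall under the limitation described above.</p>

```javascript
// Expand numeric HTML entities (decimal or hexadecimal) into characters.
// Named entities are deliberately not handled in this sketch.
function expandNumericEntities(text) {
  return text.replace(/&#([xX][0-9a-fA-F]+|[0-9]+);/g, (whole, body) => {
    const isHex = body[0] === "x" || body[0] === "X";
    const codePoint = isHex ? parseInt(body.slice(1), 16) : parseInt(body, 10);
    // Leave out-of-range references untouched instead of throwing.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : whole;
  });
}

// True if any character of the word has a code point above 255,
// i.e. the word is likely to be reported as misspelled.
function hasCharOutsideLatin1(word) {
  for (const ch of word) {
    if (ch.codePointAt(0) > 255) return true;
  }
  return false;
}

// Usage: U+00E9 (233) is inside the range, U+0219 (537) is not.
console.log(hasCharOutsideLatin1(expandNumericEntities("caf&#233;")));
console.log(hasCharOutsideLatin1(expandNumericEntities("Ia&#537;i")));
```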
<p>I searched the Net for a couple of hours today and could not find
any open-source spell checker that has Unicode support. For this
reason we keep using Aspell, which also seems to have the
best suggestions engine. Unicode support will eventually be
implemented in Aspell. <a href="mailto:kevin@atkinson.dhs.org">Email
Kevin Atkinson</a> (Aspell author and maintainer) about this ;-)</p>
<p>The Perl Unicode manual (man perluniintro) states:</p>
<blockquote>
Starting from Perl 5.6.0, Perl has had the capacity to handle Unicode
natively. Perl 5.8.0, however, is the first recommended release for
serious Unicode work. The maintenance release 5.6.1 fixed many of the
problems of the initial Unicode implementation, but for example regular
expressions still do not work with Unicode in 5.6.1.
</blockquote>
<p>In other words, do <em>not</em> assume that this script is
Unicode-safe on Perl interpreters older than 5.8.0.</p>
<p>The following Perl modules are required:</p>
<ul>
<li><a href="http://search.cpan.org/search?query=Text%3A%3AAspell&mode=all" target="_blank">Text::Aspell</a></li>
<li><a href="http://search.cpan.org/search?query=HTML%3A%3AParser&mode=all" target="_blank">HTML::Parser</a></li>
<li><a href="http://search.cpan.org/search?query=HTML%3A%3AEntities&mode=all" target="_blank">HTML::Entities</a></li>
<li><a href="http://search.cpan.org/search?query=CGI&mode=all" target="_blank">CGI</a></li>
</ul>
<p>Of these, only Text::Aspell might need to be installed manually. The
others are likely to be available by default in most Perl distributions.</p>
<address><a href="http://students.infoiasi.ro/~mishoo/">Mihai Bazon</a></address>
<!-- Created: Thu Jul 17 13:22:27 EEST 2003 -->
Last modified on Sun Aug 10 12:28:24 2003
<!-- doc-lang: English -->