1 Vis a vim-like text editor
2 ==========================
4 [![Build status](https://travis-ci.org/martanne/vis.svg?branch=master)](https://travis-ci.org/martanne/vis)
5 [![Coverity Scan Build Status](https://scan.coverity.com/projects/3939/badge.svg)](https://scan.coverity.com/projects/3939)
6 [![#vis-editor on freenode](https://www.irccloud.com/invite-svg?channel=%23vis-editor&hostname=irc.freenode.net&port=6697&ssl=1)](irc://irc.freenode.net/vis-editor)
8 Vis aims to be a modern, legacy free, simple yet efficient vim-like editor.
10 It extends vim's modal editing with built-in support for multiple
11 cursors/selections and combines it with [sam's](http://sam.cat-v.org/)
12 [structural regular expression](http://doc.cat-v.org/bell_labs/structural_regexps/)
13 based [command language](http://doc.cat-v.org/bell_labs/sam_lang_tutorial/).
As a universal editor it has decent Unicode support (including double width
16 and combining characters) and should cope with arbitrary files including:
18 - large (up to a few Gigabytes) ones including
19 - Wikipedia/OpenStreetMap XML / SQL / CSV dumps
20 - amalgamated source trees (e.g. SQLite)
21 - single line ones e.g. minified JavaScript
22 - binary ones e.g. ELF files
24 Efficient syntax highlighting is provided using Parsing Expression Grammars
25 which can be conveniently expressed using Lua in form of LPeg.
27 The editor core is written in a reasonable amount of clean (your mileage
28 may vary), modern and legacy free C code enabling it to run in resource
29 constrained environments. The implementation should be easy to hack on
30 and encourage experimentation (e.g. native built in support for multiple
31 cursors). There also exists a Lua API for in process extensions.
33 Vis strives to be *simple* and focuses on its core task: efficient text
34 management. As an example the file open dialog is provided by an independent
35 utility. There exist plans to use a client/server architecture, delegating
36 window management to your windowing system or favorite terminal multiplexer.
The intention is *not* to be bug for bug compatible with vim. Instead a
39 similar editing experience should be provided. The goal could thus be
40 summarized as "80% of vim's features implemented in roughly 1% of the code".
42 [![vis demo](https://asciinema.org/a/41361.png)](https://asciinema.org/a/41361)
44 Getting started / Build instructions
45 ====================================
47 In order to build vis you will need a C99 compiler as well as:
49 * a C library, we recommend [musl](http://www.musl-libc.org/)
50 * [libcurses](http://www.gnu.org/software/ncurses/), preferably in the
51 wide-character version
52 * [libtermkey](http://www.leonerd.org.uk/code/libtermkey/)
53 * [lua](http://www.lua.org/) >= 5.2 (optional)
54 * [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg/) >= 0.12
55 (optional runtime dependency required for syntax highlighting)
57 Assuming these dependencies are met, execute:
59 $ ./configure && make && sudo make install
61 By default the `configure` script will try to auto detect support for
62 Lua. See `configure --help` for a list of supported options. You can
63 also manually tweak the generated `config.mk` file.
On Linux based systems `make standalone` will attempt to download,
compile and install all of the above dependencies into a subfolder
in order to build a self contained statically linked binary.
69 `make local` will do the same but only for libtermkey, Lua and LPeg
70 (i.e. the system C and curses libraries are used).
72 Or simply use one of the distribution provided packages:
74 * [ArchLinux](http://www.archlinux.org/packages/?q=vis)
75 * [Alpine Linux](https://pkgs.alpinelinux.org/packages?name=vis)
76 * [NixOS](https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/editors/vis/default.nix)
77 * [Source Mage GNU/Linux](http://download.sourcemage.org/grimoire/codex/test/editors/vis)
78 * [Void Linux](https://github.com/voidlinux/void-packages/tree/master/srcpkgs/vis)
79 * [pkgsrc](http://pkgsrc.se/wip/vis-editor)
84 The following section gives a quick overview over the currently
92 = (indent, currently an alias for gq)
93 gq (format using fmt(1))
103 Operators can be forced to work line wise by specifying `V`.
108 b (previous start of a word)
109 B (previous start of a WORD)
111 e (next end of a word)
112 E (next end of a WORD)
113 F{char} (to next occurrence of char to the left)
114 f{char} (to next occurrence of char to the right)
115 ^ (first non-blank of line)
116 g0 (begin of display line)
117 g$ (end of display line)
118 ge (previous end of a word)
119 gE (previous end of a WORD)
121 G (goto line or end of file)
122 gj (display line down)
124 g_ (last non-blank of line)
125 gm (middle of display line)
128 H (goto top/home line of window)
132 L (goto bottom/last line of window)
134 '{mark} (go to start of line containing mark)
136 M (goto middle line of window)
137 ]] (next end of C-like function)
140 ][ (next start of C-like function)
141 N (repeat last search backwards)
142 n (repeat last search forward)
143 [] (previous end of C-like function)
144 [{ (previous start of block)
145 ]} (next start of block)
    [(         (previous start of parenthesis pair)
    ])         (next start of parenthesis pair)
148 { (previous paragraph)
149 ( (previous sentence)
150 [[ (previous start of C-like function)
151 ; (repeat last to/till movement)
152 , (repeat last to/till movement but in opposite direction)
153 # (search word under cursor backwards)
154 * (search word under cursor forwards)
155 T{char} (till before next occurrence of char to the left)
156 t{char} (till before next occurrence of char to the right)
157 ?{text} (to next match of text in backward direction)
158 /{text} (to next match of text in forward direction)
159 w (next start of a word)
160 W (next start of a WORD)
162 An empty line is currently neither a word nor a WORD.
164 Some of these commands do not work as in vim when prefixed with a
165 digit i.e. a multiplier. As an example in vim `3$` moves to the end
166 of the 3rd line down. However vis treats it as a move to the end of
167 current line which is repeated 3 times where the last two have no
172 All of the following text objects are implemented in an inner variant
173 (prefixed with `i`) and a normal variant (prefixed with `a`):
179 [,], (,), {,}, <,>, ", ', ` block enclosed by these symbols
181 For sentence and paragraph there is no difference between the
182 inner and normal variants.
184 gn matches the last used search term in forward direction
185 gN matches the last used search term in backward direction
187 Additionally the following text objects, which are not part of stock vim
190 ae entire file content
191 ie entire file content except for leading and trailing empty lines
192 af C-like function definition including immediately preceding comments
193 if C-like function definition only function body
195 il current line without leading and trailing white spaces
199 Vis implements more or less functional normal, operator-pending, insert,
200 replace and visual (in both line and character wise variants) modes.
202 Visual block mode is not implemented and there exists no immediate
203 plan to do so. Instead vis has built in support for multiple cursors.
205 Command mode is implemented as a regular file. Use the full power of the
206 editor to edit your commands / search terms.
208 Ex mode is deliberately not implemented, instead a variant of the structural
209 regular expression based command language of `sam(1)` is supported.
211 ### Multiple Cursors / Selections
213 vis supports multiple cursors with immediate visual feedback (unlike
214 in the visual block mode of vim where for example inserts only become
215 visible upon exit). There always exists one primary cursor located
within the current view port. Additional cursors can be created
217 as needed. If more than one cursor exists, the primary one is blinking.
219 To manipulate multiple cursors use in normal mode:
221 Ctrl-K create count new cursors on the lines above
222 Ctrl-Meta-K create count new cursors on the lines above the first cursor
223 Ctrl-J create count new cursors on the lines below
224 Ctrl-Meta-J create count new cursors on the lines below the last cursor
225 Ctrl-P remove primary cursor
226 Ctrl-N select word the cursor is currently over, switch to visual mode
227 Ctrl-U make the count previous cursor primary
228 Ctrl-D make the count next cursor primary
229 Ctrl-C remove the count cursor column
230 Ctrl-L remove all but the count cursor column
    Tab         try to align all cursors on the same column
232 Esc dispose all but the primary cursor
234 Visual mode was enhanced to recognize:
236 I create a cursor at the start of every selected line
237 A create a cursor at the end of every selected line
238 Tab left align selections by inserting spaces
239 Shift-Tab right align selections by inserting spaces
240 Ctrl-N create new cursor and select next word matching current selection
241 Ctrl-X clear (skip) current selection, but select next matching word
242 Ctrl-P remove primary cursor
243 Ctrl-U/K make the count previous cursor primary
244 Ctrl-D/J make the count next cursor primary
245 Ctrl-C remove the count cursor column
246 Ctrl-L remove all but the count cursor column
247 + rotates selections rightwards count times
248 - rotates selections leftwards count times
249 \ trim selections, remove leading and trailing white space
250 Esc clear all selections, switch to normal mode
252 In insert/replace mode
254 Shift-Tab align all cursors by inserting spaces
258 [a-z] general purpose marks
259 < start of the last selected visual area in current buffer
260 > end of the last selected visual area in current buffer
262 No marks across files are supported. Marks are not preserved over
267 Supported registers include:
269 "a-"z general purpose registers
270 "A-"Z append to corresponding general purpose register
271 "*, "+ system clipboard integration via shell script vis-clipboard
275 "_ black hole (/dev/null) register
277 If no explicit register is specified a default register is used.
279 ### Undo/Redo and Repeat
281 The text is currently snapshotted whenever an operator is completed as
282 well as when insert or replace mode is left. Additionally a snapshot
283 is also taken if in insert or replace mode a certain idle time elapses.
285 Another idea is to snapshot based on the distance between two consecutive
286 editing operations (as they are likely unrelated and thus should be
287 individually reversible).
289 Besides the regular undo functionality, the key bindings `g+` and `g-`
traverse the history in chronological order. Furthermore the `:earlier`
291 and `:later` commands provide means to restore the text to an arbitrary
294 The repeat command `.` works for all operators and is able to repeat
295 the last insertion or replacement.
299 The general purpose registers `[a-z]` can be used to record macros. Use
300 one of `[A-Z]` to append to an existing macro. `q` starts a recording,
`@` plays it back. `@@` refers to the most recently recorded macro.
302 `@:` repeats the last :-command. `@/` is equivalent to `n` in normal mode.
304 ### Structural Regular Expression based Command Language
306 Vis supports [sam's](http://sam.cat-v.org/)
307 [structural regular expression](http://doc.cat-v.org/bell_labs/structural_regexps/)
308 based [command language](http://doc.cat-v.org/bell_labs/sam_lang_tutorial/).
310 The basic command syntax supported is mostly compatible with the description
311 found in the [sam manual page](http://man.cat-v.org/plan_9/1/sam).
312 The [sam reference card](http://sam.cat-v.org/cheatsheet/) might also be useful.
314 Sam commands can be entered from the vis prompt as `:<cmd>`
316 A command behaves differently depending on the mode in which it is issued:
318 - in visual mode it behaves as if an implicit extract x command
319 matching the current selection(s) would be preceding it. That is
320 the command is executed once for each selection.
324 * if an address for the command was provided it is evaluated starting
325 from the current cursor position(s) i.e. dot is set to the current
328 * if no address was supplied to the command then:
330 + if multiple cursors exist, the command is executed once for every
331 cursor with dot set to the current line of the cursor
333 + otherwise if there is only 1 cursor then the command is executed
334 with dot set to the whole file
336 The command syntax was slightly tweaked to accept more terse commands.
338 - When specifying text or regular expressions the trailing delimiter can
339 be elided if the meaning is unambiguous.
341 - If only an address is provided the print command will be executed.
343 - The print command creates a selection matching its range.
345 - In text entry `\t` inserts a literal tab character (sam only recognizes `\n`).
347 Hence the sam command `,x/pattern/` can be abbreviated to `x/pattern`
349 If after a command no selections remain, the editor will switch to normal
350 mode otherwise it remains in visual mode.
352 Other differences compared to sam include:
354 - The following commands are deliberately not implemented:
358 * print line address `=`
359 * print character address `=#`
360 * set current file mark `k`
363 - Multi file support is currently very primitive:
365 * the "regexp" construct to evaluate an address in a file matching
366 regexp is currently not supported.
368 * the following commands related to multiple file editing are not
369 supported: `b`, `B`, `n`, `D`, `f`.
371 - The special grouping semantics where all commands of a group operate
   on the same state is not implemented.
374 - The file mark address `'` (and corresponding `k` command) is not supported
376 ### Command line prompt
378 Besides the sam command language the following commands are also recognized
379 at the `:`-command prompt. Any unique prefix can be used.
381 :bdelete close all windows which display the same file as the current one
382 :earlier revert to older text state
383 :e replace current file with a new one or reload it from disk
384 :langmap set key equivalents for layout specific key mappings
385 :later revert to newer text state
386 :! launch external command, redirect keyboard input to it
387 :map add a global key mapping
388 :map-window add a window local key mapping
389 :new open an empty window, arrange horizontally
390 :open open a new window
391 :qall close all windows, exit editor
392 :q close currently focused window
393 :r insert content of another file at current cursor position
394 :set set the options below
395 :split split window horizontally
396 :s search and replace currently implemented in terms of `sed(1)`
397 :unmap remove a global key mapping
398 :unmap-window remove a window local key mapping
399 :vnew open an empty window, arrange vertically
400 :vsplit split window vertically
401 :wq write changes then close window
402 :w write current buffer content to file
404 tabwidth [1-8] default 8
406 set display width of a tab and number of spaces to use if
409 expandtab (yes|no) default no
411 whether typed in tabs should be expanded to tabwidth spaces
413 autoindent (yes|no) default no
415 replicate spaces and tabs at the beginning of the line when
418 number (yes|no) default no
419 relativenumber (yes|no) default no
421 whether absolute or relative line numbers are printed alongside
424 syntax name default yes
426 use syntax definition given (e.g. "c") or disable syntax
      highlighting if no such definition exists (e.g. :set syntax off)
431 show/hide special white space replacement symbols
433 newlines = [0|1] default 0
434 tabs = [0|1] default 0
435 spaces = [0|1] default 0
437 cursorline (yes|no) default no
439 highlight the line on which the cursor currently resides
441 colorcolumn number default 0
443 highlight the given column
445 horizon number default 32768 (32K)
447 how far back the lexer will look to synchronize parsing
    theme      name   default dark-16.lua | solarized.lua (16 | 256 color)
451 use the given theme / color scheme for syntax highlighting
453 Commands taking a file name will use a simple file open dialog based
454 on the `vis-open` shell script and the
455 [`slmenu`](https://bitbucket.org/rafaelgg/slmenu) utility,
456 if given a file pattern or directory.
458 :e *.c # opens a menu with all C files
459 :e . # opens a menu with all files of the current directory
461 ### Configuring vis: visrc.lua, and environment variables
463 Settings and keymaps can be specified in a `visrc.lua` file, which will
464 be read by `vis` at runtime. An example `visrc.lua` file is installed
465 in `/usr/local/share/vis` by default. This file can be copied to
466 `$XDG_CONFIG_HOME/vis` (which defaults to `$HOME/.config/vis`) for
467 further configuration.
469 The environment variable `VIS_PATH` can be set to override the path
470 that `vis` will look for Lua support files as used for syntax
471 highlighting. `VIS_PATH` defaults (in this order) to
473 - The location of the `vis` binary
474 - `$XDG_CONFIG_HOME/vis`, where `$XDG_CONFIG_HOME` refers to
475 `$HOME/.config` if unset
476 - `/usr/local/share/vis`
479 The environment variable `VIS_THEME` can be set to specify the
480 theme used by `vis` e.g.
482 VIS_THEME=/path/to/your/theme.lua
485 ### Runtime Configurable Key Bindings
487 Vis supports run time key bindings via the `:{un,}map{,-window}` set of
488 commands. The basic syntax is:
490 :map <mode> <lhs> <rhs>
492 where mode is one of `normal`, `insert`, `replace`, `visual`,
493 `visual-line` or `operator-pending`. lhs refers to the key to map, rhs is
494 a key action or alias. An existing mapping can be overridden by appending
495 `!` to the map command.
497 Key mappings are always recursive, this means doing something like:
501 will not work because it will enter an endless loop. Instead vis uses
502 pseudo keys referred to as key actions which can be used to invoke a set
503 of available (see :help or <F1> for a list) editor functions. Hence the
504 correct thing to do would be:
506 :map! normal j 2<cursor-line-down>
508 Unmapping works as follows:
512 The commands suffixed with `-window` only affect the currently active window.
514 ### Layout Specific Key Bindings
Vis allows setting key equivalents for non-latin keyboard layouts. This
517 facilitates editing non-latin texts. The defined mappings take effect
518 in all non-input modes, i.e. everywhere except in insert and replace mode.
520 For example, the following maps the movement keys in Russian layout:
522 :langmap ролд hjkl
524 More generally the syntax of the `:langmap` command is:
526 :langmap <sequence of keys in your layout> <sequence of equivalent keys in latin layout>
If the key sequences do not have the same length, the rest of the longer
sequence will be discarded.
531 ### Tab <-> Space conversion and Line endings \n vs \r\n
533 Tabs can optionally be expanded to a configurable number of spaces.
534 The first line ending in the file determines what will be inserted
535 upon a line break (defaults to \n).
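
The rule above (the first line ending found decides, with `\n` as the fallback) can be illustrated with a small C sketch. The function name `detect_newline` is hypothetical and not taken from the vis sources; the real implementation may scan the text differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the line ending that should be inserted
 * on a line break, based on the first line ending found in data.
 * Falls back to "\n" when the data contains no line ending at all. */
static const char *detect_newline(const char *data, size_t len) {
    const char *nl = len ? memchr(data, '\n', len) : NULL;
    if (nl && nl > data && nl[-1] == '\r')
        return "\r\n";   /* first line is terminated by CR LF */
    return "\n";         /* plain LF, or no line ending present */
}
```

Only the first line ending is inspected, so a file with mixed line endings is treated according to its first line.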
537 ### Jump list and change list
539 A per window, file local jump list (navigate with `CTRL+O` and `CTRL+I`)
540 and change list (navigate with `g;` and `g,`) is supported. The jump
541 list is implemented as a fixed sized ring buffer.
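
As an illustration of the fixed sized ring buffer idea, here is a minimal C sketch. The names `Ring`, `ring_push` and `ring_get` as well as the capacity of 4 are assumptions made for this example; the actual code lives in `ring-buffer.[ch]` and has a different interface.

```c
#include <stddef.h>

#define RING_SIZE 4  /* hypothetical capacity, chosen for illustration */

typedef struct {
    size_t items[RING_SIZE]; /* stored file positions */
    size_t start;            /* index of the oldest entry */
    size_t count;            /* number of valid entries */
} Ring;

/* Append a position; once the buffer is full the oldest entry is overwritten. */
static void ring_push(Ring *r, size_t pos) {
    size_t idx = (r->start + r->count) % RING_SIZE;
    r->items[idx] = pos;
    if (r->count < RING_SIZE)
        r->count++;
    else
        r->start = (r->start + 1) % RING_SIZE;
}

/* Fetch the i-th entry, counted from the oldest (0) to the newest.
 * Returns 1 on success and 0 if i is out of range. */
static int ring_get(const Ring *r, size_t i, size_t *pos) {
    if (i >= r->count)
        return 0;
    *pos = r->items[(r->start + i) % RING_SIZE];
    return 1;
}
```

Because the capacity is fixed, memory use stays constant and old jump positions silently fall off the end.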
545 The mouse is currently not used at all.
549 Some of the features of vim which will *not* be implemented:
551 - tabs / multiple workspaces / advanced window management
552 - file and directory browser
553 - support for file archives (tar, zip, ...)
554 - support for network protocols (ftp, http, ssh ...)
557 - GUIs (neither x11, motif, gtk, win32 ...) although the codebase
558 should make it easy to add them
560 - plugins (certainly not vimscript, if anything it should be lua based)
565 - internal spell checker
566 - compile time configurable features / `#ifdef` mess
568 Lua API for in process extension
569 ================================
Vis provides a simple Lua API for in process extension. At startup the
`visrc.lua` file is executed. It can be used to register a few event
callbacks which will be invoked from the editor core. While executing
574 these user scripts the editor core is blocked, hence it is intended for
575 simple short lived (configuration) tasks.
At this time there exist no API stability guarantees.
580 - `MODE_NORMAL`, `MODE_OPERATOR_PENDING`, `MODE_INSERT`, `MODE_REPLACE`, `MODE_VISUAL`, `MODE_VISUAL_LINE` mode constants
581 - `mode` current mode (one of the above constants)
582 - `lexers` LPeg lexer support module
589 - `win` currently focused window
590 - `windows()` iterator
592 - `info(msg)` display a single line message
593 - `message(msg)` display an arbitrarily long message
595 - `textobject_register(function)` register a Lua function as a text object, returns associated `id` or `-1`
596 - `textobject(id)` select/execute a text object
597 - `motion_register(function)` register a Lua function as a motion, returns associated `id` or `-1`
598 - `motion(id)` select/execute a motion
599 - `command_register(name, function(argv, force, win, cursor, range))` hook up a Lua function to `:name` command
600 - `map(mode, key, function)` map a Lua function to `key` in `mode`
602 - `content(pos, len)` or `content({start, finish})`
603 - `insert(pos, data)`
604 - `delete(pos, len)` or `delete({start, finish})`
607 - `lines[0..#lines+1]` array giving read/write access to lines
608 - `newlines` type of newlines either `"nl"` or `"crnl"`
609 - `size` current file size in bytes
612 - `cursors_iterator()`
613 - `cursors[1..#cursors]` array giving read access to all cursors
614 - `cursor` primary cursor
615 - `syntax` lexer name used for syntax highlighting or `nil`
617 - `line` (1 based), `col` (1 based)
619 - `pos` bytes from start of file (0 based)
620 - `number` one based index of cursor
621 - `selection` read/write access to selection represented as a `range`
 - `range` denoted by absolute positions in bytes from the start of the file,
623 an invalid range is represented as `nil`
Most of the exposed objects are managed by the C core. Although there
is a simple object lifetime management mechanism in place, it is still
629 recommended to *not* let the Lua objects escape from the event handlers
630 (e.g. by assigning to global Lua variables).
632 Text management using a piece table/chain
633 =========================================
635 The core of this editor is a persistent data structure called a piece
636 table which supports all modifications in `O(m)`, where `m` is the number
637 of non-consecutive editing operations. This bound could be further
638 improved to `O(log m)` by use of a balanced search tree, however the
639 additional complexity doesn't seem to be worth it, for now.
641 The actual data is stored in buffers which are strictly append only.
642 There exist two types of buffers, one fixed-sized holding the original
643 file content and multiple append-only ones storing the modifications.
A text, i.e. a sequence of bytes, is represented as a doubly linked
646 list of pieces each with a pointer into a buffer and an associated
647 length. Pieces are never deleted but instead always kept around for
648 redo/undo support. A span is a range of pieces, consisting of a start
649 and end piece. Changes to the text are always performed by swapping
650 out an existing, possibly empty, span with a new one.
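
To make the terminology concrete, the following C sketch shows one possible shape of a piece and a span. All names (`Piece`, `Span`, `span_init`, `chain_copy`) are illustrative assumptions. The real definitions in `text.c` differ in detail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical piece: a window into an append-only buffer. */
typedef struct Piece Piece;
struct Piece {
    Piece *prev, *next;  /* neighbours in the doubly linked chain */
    const char *data;    /* points into a buffer, never copied */
    size_t len;          /* number of bytes covered by this piece */
};

/* A span is a run of consecutive pieces; start == NULL means empty. */
typedef struct {
    Piece *start, *end;
    size_t len;          /* total number of bytes covered */
} Span;

/* Build a span from two pieces of the same chain and sum up its length. */
static Span span_init(Piece *start, Piece *end) {
    Span s = { start, end, 0 };
    for (Piece *p = start; p; p = p->next) {
        s.len += p->len;
        if (p == end)
            break;
    }
    return s;
}

/* Copy the text between two sentinel pieces into dst (at most cap bytes). */
static size_t chain_copy(const Piece *begin, const Piece *end,
                         char *dst, size_t cap) {
    size_t n = 0;
    for (const Piece *p = begin->next; p != end && n < cap; p = p->next) {
        size_t take = p->len < cap - n ? p->len : cap - n;
        memcpy(dst + n, p->data, take);
        n += take;
    }
    return n;
}
```

The text itself is never stored in the pieces. Reading it back means walking the chain and following each `data` pointer.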
652 An empty document is represented by two special sentinel pieces which
660 Loading a file from disk is as simple as mmap(2)-ing it into a buffer,
creating a corresponding piece and adding it to the doubly linked list.
662 Hence loading a file is a constant time operation i.e. independent of
663 the actual file size (assuming the operating system uses demand paging).
665 /-+ --> +-----------------+ --> +-\
666 | | | I am an editor! | | |
667 \-+ <-- +-----------------+ <-- +-/
Inserting a chunk of data amounts to appending the new content to a
modification buffer, followed by the creation of new pieces. An insertion
in the middle of an existing piece requires the creation of 3 new pieces.
Two of them hold references to the text before and after the insertion
point, respectively, while the third one points to the newly added text.
679 /-+ --> +---------------+ --> +----------------+ --> +--+ --> +-\
680 | | | I am an editor| |which sucks less| |! | | |
681 \-+ <-- +---------------+ <-- +----------------+ <-- +--+ <-- +-/
684 modification buffer content: "which sucks less"
686 During this insertion operation the old span [3,3] has been replaced
687 by the new span [4,6]. Notice that the pieces in the old span were not
688 changed, therefore still point to their predecessors/successors, and can
689 thus be swapped back in.
691 If the insertion point happens to be at a piece boundary, the old span
692 is empty, and the new span only consists of the newly allocated piece.
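
The mid-piece insertion and the span swap can be sketched in C as follows. The function names `piece_new`, `span_swap` and `insert_inside` are hypothetical, and the sketch leaves out buffer management and history recording, which the real code in `text.c` handles.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct Piece Piece;
struct Piece {
    Piece *prev, *next;
    const char *data;    /* points into an append-only buffer */
    size_t len;
};

static Piece *piece_new(Piece *prev, Piece *next, const char *data, size_t len) {
    Piece *p = malloc(sizeof *p);
    if (p) {
        p->prev = prev;
        p->next = next;
        p->data = data;
        p->len = len;
    }
    return p;
}

/* Replace the old span [old_start, old_end] by [new_start, new_end].
 * Only the outer neighbours are relinked; the old pieces keep their own
 * prev/next pointers, which is what makes swapping them back in possible. */
static void span_swap(Piece *old_start, Piece *old_end,
                      Piece *new_start, Piece *new_end) {
    old_start->prev->next = new_start;
    old_end->next->prev = new_end;
}

/* Insert len bytes (assumed to be already appended to a modification
 * buffer) at offset off inside piece p, with 0 < off < p->len.
 * Returns the piece holding the new text, or NULL on allocation failure. */
static Piece *insert_inside(Piece *p, size_t off, const char *data, size_t len) {
    Piece *before = piece_new(p->prev, NULL, p->data, off);
    Piece *middle = piece_new(before, NULL, data, len);
    Piece *after = piece_new(middle, p->next, p->data + off, p->len - off);
    if (!before || !middle || !after) {
        free(before);
        free(middle);
        free(after);
        return NULL;
    }
    before->next = middle;
    middle->next = after;
    span_swap(p, p, before, after);
    return middle;
}
```

Undoing the insertion is the same call with the arguments reversed: `span_swap(before, after, p, p)` restores the original piece without touching any buffer.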
697 Similarly a delete operation splits the pieces at appropriate places.
699 /-+ --> +-----+ --> +--+ --> +-\
701 \-+ <-- +-----+ <-- +--+ <-- +-/
704 Where the old span [4,5] got replaced by the new span [7,7]. The underlying
705 buffers remain unchanged.
Notice that the common case of appending text to a given piece is fast
since the new data is simply appended to the buffer and the piece length
is increased accordingly. In order to keep the number of pieces down,
the most recently edited piece is cached and changes to it are done
in place (this is the only time buffers are modified in a non-append
only way). As a consequence they cannot be undone.
720 Since the buffers are append only and the spans/pieces are never destroyed
721 undo/redo functionality is implemented by swapping the required spans/pieces
724 As illustrated above, each change to the text is recorded by an old and
725 a new span. An action consists of multiple changes which logically belong
726 to each other and should thus also be reverted together. For example
727 a search and replace operation is one action with possibly many changes
730 The text states can be marked by means of a snapshotting operation.
731 Snapshotting saves a new node to the history graph and creates a fresh
732 Action to which future changes will be appended until the next snapshot.
734 Actions make up the nodes of a connected digraph, each representing a state
735 of the file at some time during the current editing session. The edges of the
736 digraph represent state transitions that are supported by the editor. The edges
737 are implemented as four Action pointers (`prev`, `next`, `earlier`, and `later`).
739 The editor operations that execute the four aforementioned transitions
are `undo`, `redo`, `earlier`, and `later`, respectively. Undo and
741 redo behave in the traditional manner, changing the state one Action
742 at a time. Earlier and later, however, traverse the states in chronological
743 order, which may occasionally involve undoing and redoing many Actions at once.
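
The following C sketch models only the shape of this history graph. The four pointer names come from the description above, while `History`, `history_snapshot` and the four transition functions are assumptions for illustration. Applying the recorded span swaps when moving between states is omitted.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct Action Action;
struct Action {
    Action *prev;     /* state reached by undo */
    Action *next;     /* state reached by redo (most recent child) */
    Action *earlier;  /* chronologically preceding state */
    Action *later;    /* chronologically following state */
};

typedef struct {
    Action *current;  /* state the text is currently in */
    Action *last;     /* most recently created state */
} History;

/* Create a new state as a child of the current one. */
static Action *history_snapshot(History *h) {
    Action *a = calloc(1, sizeof *a);
    if (!a)
        return NULL;
    a->prev = h->current;
    if (h->current)
        h->current->next = a;
    a->earlier = h->last;
    if (h->last)
        h->last->later = a;
    h->current = h->last = a;
    return a;
}

/* Each transition follows one edge if it exists and returns the new state. */
static Action *history_undo(History *h) {
    if (h->current && h->current->prev)
        h->current = h->current->prev;
    return h->current;
}

static Action *history_redo(History *h) {
    if (h->current && h->current->next)
        h->current = h->current->next;
    return h->current;
}

static Action *history_earlier(History *h) {
    if (h->current && h->current->earlier)
        h->current = h->current->earlier;
    return h->current;
}

static Action *history_later(History *h) {
    if (h->current && h->current->later)
        h->current = h->current->later;
    return h->current;
}
```

After an undo followed by a new edit the graph branches: `prev`/`next` follow the branch, whereas `earlier`/`later` still visit every state in the order it was created.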
748 Because we are working with a persistent data structure marks can be
749 represented as pointers into the underlying (append only) buffers.
750 To get the position of an existing mark it suffices to traverse the
751 list of pieces and perform a range query on the associated buffer
752 segments. This also nicely integrates with the undo/redo mechanism.
753 If a span is swapped out all contained marks (pointers) become invalid
754 because they are no longer reachable from the piece chain. Once an
755 action is undone, and the corresponding span swapped back in, the
756 marks become visible again. No explicit mark management is necessary.
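
A minimal C sketch of such a mark lookup is shown below, assuming a chain delimited by two sentinel pieces. The names `Mark` and `mark_get` are hypothetical and the real API in `text.[ch]` differs.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct Piece Piece;
struct Piece {
    Piece *prev, *next;
    const char *data;    /* points into an append-only buffer */
    size_t len;
};

/* Hypothetical mark representation: a pointer into one of the buffers. */
typedef const char *Mark;

/* Translate a mark into a byte offset from the start of the text.
 * Returns 1 and stores the offset in *pos if the mark is reachable from
 * the current piece chain, 0 otherwise. Addresses are compared as
 * integers because the mark may belong to a different buffer than the
 * piece under inspection. */
static int mark_get(const Piece *begin, const Piece *end, Mark mark, size_t *pos) {
    uintptr_t m = (uintptr_t)mark;
    size_t off = 0;
    for (const Piece *p = begin->next; p != end; p = p->next) {
        uintptr_t lo = (uintptr_t)p->data;
        if (m >= lo && m - lo < p->len) {
            *pos = off + (size_t)(m - lo);
            return 1;
        }
        off += p->len;
    }
    return 0;
}
```

A mark pointing into text that is currently swapped out is simply not found, and reappears once the corresponding span is swapped back in.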
761 The main advantage of the piece chain as described above is that all
762 operations are performed independent of the file size but instead linear
763 in the number of pieces i.e. editing operations. The original file buffer
764 never changes which means the `mmap(2)` can be performed read only which
765 makes optimal use of the operating system's virtual memory / paging system.
767 The maximum editable file size is limited by the amount of memory a process
768 is allowed to map into its virtual address space, this shouldn't be a problem
769 in practice. The whole process assumes that the file can be used as is.
770 In particular the editor assumes all input and the file itself is encoded
771 as UTF-8. Supporting other encodings would require conversion using `iconv(3)`
772 or similar upon loading and saving the document.
774 Similarly the editor has to cope with the fact that lines can be terminated
775 either by `\n` or `\r\n`. There is no conversion to a line based structure in
776 place. Instead the whole text is exposed as a sequence of bytes. All
777 addressing happens by means of zero based byte offsets from the start of
780 The main disadvantage of the piece chain data structure is that the text
is not stored contiguously in memory which makes seeking around somewhat
782 harder. This also implies that standard library calls like the `regex(3)`
783 functions can not be used as is. However this is the case for all but
784 the most simple data structures used in text editors.
786 Syntax Highlighting using Parsing Expression Grammars
787 =====================================================
789 [Parsing Expression Grammars](https://en.wikipedia.org/wiki/Parsing_expression_grammar)
790 (PEG) have the nice property that they are closed under composition.
791 In the context of an editor this is useful because lexers can be
792 embedded into each other, thus simplifying syntax highlighting
795 Vis reuses the [Lua](http://www.lua.org/) [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg/)
796 based lexers from the [Scintillua](http://foicica.com/scintillua/) project.
801 This section contains some ideas for further architectural changes.
803 Event loop with asynchronous I/O
804 --------------------------------
806 The editor core should feature a proper main loop mechanism supporting
807 asynchronous non-blocking and always cancelable tasks which could be
808 used for all possibly long lived actions such as:
811 - `:substitute` and `:write` commands
813 - compiler integration (similar to vim's quick fix functionality)
815 Client/Server Architecture / RPC interface
816 ------------------------------------------
818 In principle it would be nice to follow a similar client/server approach
819 as [sam/samterm](http://sam.cat-v.org/) i.e. having the main editor as a
820 server and each window as a separate client process with communication
821 over a unix domain socket.
823 That way window management would be taken care of by dwm or dvtm and the
824 different client processes would still share common cut/paste registers
827 This would also enable a language agnostic plugin system.
829 Efficient Search and Replace
830 ----------------------------
832 Currently the editor copies the whole text to a contiguous memory block
833 and then uses the standard regex functions from libc. Clearly this is not
834 a satisfactory solution for large files.
836 The long term solution is to write our own regular expression engine or
837 modify an existing one to make use of the iterator API. This would allow
838 efficient search without having to double memory consumption.
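
The current copy-then-search approach might look roughly like the following C sketch, which flattens a piece chain into one allocation and runs the POSIX `regcomp(3)`/`regexec(3)` functions over it. The `Piece` layout and the function name `chain_search` are assumptions for illustration and are not taken from `text-regex.c`.

```c
#include <regex.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct Piece Piece;
struct Piece {
    Piece *prev, *next;
    const char *data;
    size_t len;
};

/* Search for an extended regular expression in the text between two
 * sentinel pieces. Returns the byte offset of the first match or -1.
 * The whole text is copied first, which doubles memory consumption.
 * Text containing NUL bytes is only searched up to the first NUL. */
static long chain_search(const Piece *begin, const Piece *end, const char *pattern) {
    size_t size = 0;
    for (const Piece *p = begin->next; p != end; p = p->next)
        size += p->len;

    char *buf = malloc(size + 1);
    if (!buf)
        return -1;
    size_t n = 0;
    for (const Piece *p = begin->next; p != end; p = p->next) {
        memcpy(buf + n, p->data, p->len);
        n += p->len;
    }
    buf[n] = '\0';

    long pos = -1;
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED) == 0) {
        regmatch_t m;
        if (regexec(&re, buf, 1, &m, 0) == 0)
            pos = (long)m.rm_so;
        regfree(&re);
    }
    free(buf);
    return pos;
}
```

A match may span several pieces, which is exactly why the copy is needed as long as the regex engine expects one contiguous string.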
840 The used regex engine should use a non-backtracking algorithm. Useful
843 - [Russ Cox's regex page](http://swtch.com/~rsc/regexp/)
844 - [TRE](https://github.com/laurikari/tre) as
845 [used by musl](http://git.musl-libc.org/cgit/musl/tree/src/regex)
846 which uses a parallel [TNFA matcher](http://laurikari.net/ville/spire2000-tnfa.ps)
847 - [Plan9's regex library](http://plan9.bell-labs.com/sources/plan9/sys/src/libregexp/)
848 which has its root in Rob Pike's sam text editor
849 - [RE2](https://github.com/google/re2) C++ regex library
854 Feel free to join `#vis-editor` on freenode to discuss development related issues.
A quick overview of the code structure to get you started:
858 File(s) | Description
859 ------------------- | -----------------------------------------------------
860 `array.[ch]` | dynamically growing array, can store arbitrarily sized objects
861 `buffer.[ch]` | dynamically growing buffer used for registers and macros
862 `config.def.h` | definition of default key bindings (mapping of key actions)
863 `lexers/` | Lua LPeg based lexers used for syntax highlighting
864 `main.c` | key action definitions, program entry point
865 `map.[ch]` | crit-bit tree based map supporting unique prefix lookups and ordered iteration, used to implement `:`-commands and run time key bindings
866 `register.[ch]` | register implementation, system clipboard integration via `vis-clipboard`
867 `ring-buffer.[ch]` | fixed size ring buffer used for the jump list
868 `sam.[ch]` | structural regular expression based command language
869 `text.[ch]` | low level text / marks / {un,re}do tree / piece table implementation
870 `text-motions.[ch]` | movement functions take a file position and return a new one
871 `text-objects.[ch]` | functions take a file position and return a file range
872 `text-regex.[ch]` | text search functionality, designated place for regex engine
873 `text-util.[ch]` | text related utility functions mostly dealing with file ranges
874 `ui-curses.[ch]` | a terminal / curses based user interface implementation
875 `ui.h` | abstract interface which has to be implemented by ui backends
876 `view.[ch]` | ui-independent viewport, shows part of a file, syntax highlighting, cursor placement, selection handling
877 `vis-cmds.c` | vi(m) `:`-command implementation
878 `vis-core.h` | internal header file, various structs for core editor primitives
879 `vis.c` | vi(m) specific editor frontend implementation
880 `vis.h` | vi(m) specific editor frontend library public API
881 `vis-lua.[ch]` | Lua bindings, exposing core vis APIs for in process extension
882 `vis-modes.c` | vi(m) mode switching, enter/leave event handling
883 `vis-motions.c` | vi(m) cursor motion implementations, uses `text-motions.h` internally
884 `vis-operators.c` | vi(m) operator implementation
885 `vis-prompt.c` | `:`, `/` and `?` prompt implemented as a regular file/window with custom key bindings
886 `vis-text-objects.c`| vi(m) text object implementations, uses `text-objects.h` internally
887 `visrc.lua` | Lua startup and configuration script
Testing infrastructure for the
[low level core data structures](https://github.com/martanne/vis-test/tree/master/core),
[vim compatibility](https://github.com/martanne/vis-test/tree/master/vim) and
[vis specific features](https://github.com/martanne/vis-test/tree/master/vis)
is in place, but lacks proper test cases.