Vis a vim-like text editor
==========================

Vis aims to be a modern, legacy free, simple yet efficient vim-like editor.

As a universal editor it has decent Unicode support (including double width
and combining characters) and should cope with arbitrary files including:

- large (up to a few Gigabytes) ones including
  - Wikipedia/OpenStreetMap XML / SQL / CSV dumps
  - amalgamated source trees (e.g. SQLite)
- single line ones e.g. minified JavaScript
- binary ones e.g. ELF files

Efficient syntax highlighting is provided using Parsing Expression Grammars
which can be conveniently expressed using Lua in the form of LPeg.

The editor core is written in a reasonable amount of clean (your mileage
may vary), modern and legacy free C code enabling it to run in resource
constrained environments. The implementation should be easy to hack on
and encourage experimentation (e.g. native built in support for multiple
cursors). There also exists a Lua API for in process extensions.

Vis strives to be *simple* and focuses on its core task: efficient text
management. As an example the file open dialog is provided by an independent
utility. There exist plans to use a client/server architecture, delegating
window management to your windowing system or favorite terminal multiplexer.

The intention is *not* to be bug for bug compatible with vim, instead a
similar editing experience should be provided. The goal could thus be
summarized as "80% of vim's features implemented in roughly 1% of the code".

![vis demo](https://raw.githubusercontent.com/martanne/vis/gh-pages/screencast.gif)

Getting started / Build instructions
====================================

In order to build vis you will need a C99 compiler as well as:

 * a C library, we recommend [musl](http://www.musl-libc.org/)
 * [libcurses](http://www.gnu.org/software/ncurses/), preferably in the
   wide-character version
 * [libtermkey](http://www.leonerd.org.uk/code/libtermkey/)
 * [lua](http://www.lua.org/) >= 5.2 (optional)
 * [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg/) >= 0.12
   (optional runtime dependency required for syntax highlighting)

Assuming these dependencies are met, execute:

    $ VIS_PATH=. ./vis config.h

By default the `configure` script will try to auto detect support for
Lua. See `configure --help` for a list of supported options. You can
also manually tweak the generated `config.mk` file.

On Linux based systems `make standalone` will attempt to download,
compile and install all of the above dependencies into a subfolder
in order to build a self contained statically linked binary.

`make local` will do the same but only for libtermkey, Lua and LPeg
(i.e. the system C and curses libraries are used).

The following section gives a quick overview of the currently
supported features.

    =       (format using fmt(1))

Operators can be forced to work line wise by specifying `V`.

    gj        (display line down)
    ^         (first non-blank of line)
    g_        (last non-blank of line)
    b         (previous start of a word)
    B         (previous start of a WORD)
    w         (next start of a word)
    W         (next start of a WORD)
    e         (next end of a word)
    E         (next end of a WORD)
    ge        (previous end of a word)
    gE        (previous end of a WORD)
    {         (previous paragraph)
    (         (previous sentence)
    [[        (previous start of C-like function)
    []        (previous end of C-like function)
    ][        (next start of C-like function)
    ]]        (next end of C-like function)
    g0        (begin of display line)
    gm        (middle of display line)
    g$        (end of display line)
    G         (goto line or end of file)
    n         (repeat last search forward)
    N         (repeat last search backwards)
    H         (goto top/home line of window)
    M         (goto middle line of window)
    L         (goto bottom/last line of window)
    *         (search word under cursor forwards)
    #         (search word under cursor backwards)
    f{char}   (to next occurrence of char to the right)
    t{char}   (till before next occurrence of char to the right)
    F{char}   (to next occurrence of char to the left)
    T{char}   (till before next occurrence of char to the left)
    ;         (repeat last to/till movement)
    ,         (repeat last to/till movement but in opposite direction)
    /{text}   (to next match of text in forward direction)
    ?{text}   (to next match of text in backward direction)
    '{mark}   (go to start of line containing mark)

An empty line is currently neither a word nor a WORD.

Some of these commands do not work as in vim when prefixed with a
digit i.e. a multiplier. As an example in vim `3$` moves to the end
of the line two lines below the current one. However vis treats it as
a move to the end of the current line which is repeated 3 times, where
the last two have no effect.

All of the following text objects are implemented in an inner variant
(prefixed with `i`) and a normal variant (prefixed with `a`):

    [,], (,), {,}, <,>, ", ', `     block enclosed by these symbols

For sentence and paragraph there is no difference between the
inner and normal variants.

    gn      matches the last used search term in forward direction
    gN      matches the last used search term in backward direction

Additionally the following text objects, which are not part of stock vim,
are also supported:

    ae      entire file content
    ie      entire file content except for leading and trailing empty lines
    af      C-like function definition including immediately preceding comments
    if      C-like function definition only function body
    il      current line without leading and trailing white spaces

Vis implements more or less functional normal, operator-pending, insert,
replace and visual (in both line and character wise variants) modes.

Visual block mode is not implemented and there exists no immediate
plan to do so. Instead vis has built in support for multiple cursors.

Command mode is implemented as a regular file. Use the full power of the
editor to edit your commands / search terms.

Ex mode is deliberately not implemented, use `ssam(1)` if you need a
stream editor.

### Multiple Cursors / Selections

vis supports multiple cursors with immediate visual feedback (unlike
in the visual block mode of vim where for example inserts only become
visible upon exit). There always exists one primary cursor located
within the current view port. Additional cursors can be created
as needed. If more than one cursor exists, the primary one is blinking.

To manipulate multiple cursors use in normal mode:

    CTRL-K   create a new cursor on the line above
    CTRL-J   create a new cursor on the line below
    CTRL-P   remove primary cursor
    CTRL-N   select word the cursor is currently over, switch to visual mode
    CTRL-U   make the previous cursor primary
    CTRL-D   make the next cursor primary
    TAB      try to align all cursors on the same column
    ESC      if a selection is active, clear it.
             Otherwise dispose all but the primary cursor.

Visual mode was enhanced to recognize:

    I        create a cursor at the start of every selected line
    A        create a cursor at the end of every selected line
    CTRL-N   create new cursor and select next word matching current selection
    CTRL-X   clear (skip) current selection, but select next matching word
    CTRL-P   remove primary cursor
    CTRL-U   make the previous cursor primary
    CTRL-D   make the next cursor primary

In insert/replace mode

    S-Tab    aligns all cursors by inserting spaces

    [a-z]   general purpose marks
    <       start of the last selected visual area in current buffer
    >       end of the last selected visual area in current buffer

No marks across files are supported. Marks are not preserved over
editing sessions.

Supported registers include:

    "a-"z   general purpose registers
    "A-"Z   append to corresponding general purpose register
    "*, "+  system clipboard integration via shell scripts vis-{copy,paste}
    "_      black hole (/dev/null) register

If no explicit register is specified a default register is used.

### Undo/Redo and Repeat

The text is currently snapshotted whenever an operator is completed as
well as when insert or replace mode is left. Additionally a snapshot
is taken if in insert or replace mode a certain idle time elapses.

Another idea is to snapshot based on the distance between two consecutive
editing operations (as they are likely unrelated and thus should be
individually reversible).

Besides the regular undo functionality, the key bindings `g+` and `g-`
traverse the history in chronological order. Furthermore the `:earlier`
and `:later` commands provide means to restore the text to an arbitrary
point in time.

The repeat command `.` works for all operators and is able to repeat
the last insertion or replacement.

The general purpose registers `[a-z]` can be used to record macros. Use
one of `[A-Z]` to append to an existing macro. `q` starts a recording,
`@` plays it back. `@@` refers to the most recently recorded macro.
`@:` repeats the last :-command. `@/` is equivalent to `n` in normal mode.

### Command line prompt

At the `:`-command prompt only the following commands are recognized, any
valid unique prefix can be used:

    :bdelete       close all windows which display the same file as the current one
    :edit          replace current file with a new one or reload it from disk
    :open          open a new window
    :qall          close all windows, exit editor
    :quit          close currently focused window
    :read          insert content of another file at current cursor position
    :split         split window horizontally
    :vsplit        split window vertically
    :new           open an empty window, arrange horizontally
    :vnew          open an empty window, arrange vertically
    :wq            write changes then close window
    :xit           like :wq but write only when changes have been made
    :write         write current buffer content to file
    :saveas        save file under another name
    :substitute    search and replace currently implemented in terms of `sed(1)`
    :earlier       revert to older text state
    :later         revert to newer text state
    :map           add a global key mapping
    :unmap         remove a global key mapping
    :map-window    add a window local key mapping
    :unmap-window  remove a window local key mapping
    :langmap       set key equivalents for layout specific key mappings
    :!             filter range through external command
    :|             pipe range to external command and display output in a new window
    :set           set the options below

      tabwidth   [1-8]           default 8

        set display width of a tab and number of spaces to use if
        expandtab is enabled

      expandtab  (yes|no)        default no

        whether typed in tabs should be expanded to tabwidth spaces

      autoindent (yes|no)        default no

        replicate spaces and tabs at the beginning of the line when
        starting a new line

      number         (yes|no)    default no
      relativenumber (yes|no)    default no

        whether absolute or relative line numbers are printed alongside
        the file content

      syntax      name           default yes

        use syntax definition given (e.g. "c") or disable syntax
        highlighting if no such definition exists (e.g. :set syntax off)

        show/hide special white space replacement symbols

        newlines = [0|1]         default 0
        tabs     = [0|1]         default 0
        spaces   = [0|1]         default 0

      cursorline (yes|no)        default no

        highlight the line on which the cursor currently resides

      colorcolumn number         default 0

        highlight the given column

      theme      name            default dark-16.lua | solarized.lua (16 | 256 color)

        use the given theme / color scheme for syntax highlighting

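The unique prefix rule for `:`-commands can be illustrated with a small C
sketch. It is not the editor's actual implementation (which is based on a
crit-bit tree, see `map.[ch]`); it uses a plain linear scan instead. The
abbreviated `commands` table, the name `command_lookup` and the rule that an
exact match always wins are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, abbreviated command table; the editor knows more commands. */
static const char *commands[] = {
    "bdelete", "earlier", "edit", "later", "open", "qall", "quit",
    "read", "set", "split", "vsplit", "wq", "write",
};

/* Return the command selected by `prefix`, or NULL if the prefix is empty,
 * matches nothing, or matches more than one command. An exact match wins
 * even if longer commands share the same prefix (assumption). */
static const char *command_lookup(const char *prefix) {
    assert(prefix != NULL);
    size_t n = strlen(prefix);
    const char *found = NULL;
    int ambiguous = 0;
    if (n == 0)
        return NULL;
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strncmp(commands[i], prefix, n) != 0)
            continue;             /* not a candidate */
        if (commands[i][n] == '\0')
            return commands[i];   /* exact match */
        if (found)
            ambiguous = 1;        /* second candidate seen */
        found = commands[i];
    }
    return ambiguous ? NULL : found;
}
```

With this table `command_lookup("bd")` selects `bdelete`, whereas `e` is
rejected because both `earlier` and `edit` start with it.
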
Each command can be prefixed with a range made up of a start and
an end position as in start,end. Valid position specifiers are:

    .          start of the current line
    +n and -n  start of the line relative to the current line
    'm         position of mark m
    /pattern/  first match after current position

If only a start position without a command is given then the cursor
is moved to that position. Additionally the following ranges are
predefined:

    %          the whole file, equivalent to 1,$
    *          the current selection, equivalent to '<,'>

History support, tab completion and wildcard expansion are other
worthwhile features. However implementing them inside the editor feels
wrong. For now you can use the `:edit` command with a pattern or a
directory.

vis will call the `vis-open` script which invokes dmenu or slmenu
with the files corresponding to the pattern. The file you select in
dmenu/slmenu will be opened in vis.

### Runtime Configurable Key Bindings

Vis supports run time key bindings via the `:{un,}map{,-window}` set of
commands. The basic syntax is:

    :map <mode> <lhs> <rhs>

where mode is one of `normal`, `insert`, `replace`, `visual`,
`visual-line` or `operator-pending`. lhs refers to the key to map, rhs is
a key action or alias. An existing mapping can be overridden by appending
`!` to the map command.

Key mappings are always recursive, this means doing something like:

will not work because it will enter an endless loop. Instead vis uses
pseudo keys referred to as key actions which can be used to invoke a set
of available (see :help or <F1> for a list) editor functions. Hence the
correct thing to do would be:

    :map! normal j 2<cursor-line-down>

Unmapping works as follows:

The commands suffixed with `-window` only affect the currently active window.

### Layout Specific Key Bindings

Vis allows setting key equivalents for non-latin keyboard layouts. This
facilitates editing non-latin texts. The defined mappings take effect
in all non-input modes, i.e. everywhere except in insert and replace mode.

For example, the following maps the movement keys in Russian layout:

    :langmap ролд hjkl

More generally the syntax of the `:langmap` command is:

    :langmap <sequence of keys in your layout> <sequence of equivalent keys in latin layout>

If the key sequences do not have the same length, the rest of the longer
sequence will be discarded.

### Tab <-> Space conversion and Line endings \n vs \r\n

Tabs can optionally be expanded to a configurable number of spaces.
The first line ending in the file determines what will be inserted
upon a line break (defaults to \n).

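The line ending rule just described could be sketched in C as follows. The
function name `detect_newline` and its signature are made up for this
example and are not taken from the vis sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the sequence to insert upon a line break: "\r\n" if the first
 * line ending found in the data is CRLF, "\n" otherwise. This includes
 * the case where the data contains no line ending at all. */
static const char *detect_newline(const char *data, size_t len) {
    assert(data != NULL || len == 0);
    const char *nl = len ? memchr(data, '\n', len) : NULL;
    if (nl && nl > data && nl[-1] == '\r')
        return "\r\n";
    return "\n";
}
```

Only the first line ending is inspected, so a file with mixed line endings
keeps whatever convention its first line uses.
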
### Jump list and change list

A per window, file local jump list (navigate with `CTRL+O` and `CTRL+I`)
and change list (navigate with `g;` and `g,`) is supported. The jump
list is implemented as a fixed sized ring buffer.

The mouse is currently not used at all.

Some of the features of vim which will *not* be implemented:

 - tabs / multiple workspaces / advanced window management
 - file and directory browser
 - support for file archives (tar, zip, ...)
 - support for network protocols (ftp, http, ssh ...)
 - GUIs (neither x11, motif, gtk, win32 ...) although the codebase
   should make it easy to add them
 - plugins (certainly not vimscript, if anything it should be lua based)
 - ex mode (if you need a stream editor use `ssam(1)`)
 - internal spell checker
 - compile time configurable features / `#ifdef` mess

Lua API for in process extension
================================

Vis provides a simple Lua API for in process extension. At startup the
`visrc.lua` file is executed, this can be used to register a few event
callbacks which will be invoked from the editor core. While executing
these user scripts the editor core is blocked, hence it is intended for
simple short lived (configuration) tasks.

At this time there exist no API stability guarantees.

 - `MODE_NORMAL`, `MODE_OPERATOR_PENDING`, `MODE_INSERT`, `MODE_REPLACE`, `MODE_VISUAL`, `MODE_VISUAL_LINE` mode constants
 - `lexers` LPeg lexer support module
 - `windows()` iterator
 - `textobject_register(function)` register a Lua function as a text object, returns associated `id` or `-1`
 - `textobject(id)` select/execute a text object
 - `motion_register(function)` register a Lua function as a motion, returns associated `id` or `-1`
 - `motion(id)` select/execute a motion
 - `map(mode, key, function)` map a Lua function to `key` in `mode`
 - `content(pos, len)`
 - `insert(pos, data)`
 - `lines[0..#lines+1]` array giving read/write access to lines
 - `syntax` lexer name used for syntax highlighting or `nil`
 - `line` (1 based), `col` (0 based)
 - `pos` bytes from start of file (0 based)

Most of the exposed objects are managed by the C core. Although there
is a simple object life time management mechanism in place, it is still
recommended to *not* let the Lua objects escape from the event handlers
(e.g. by assigning to global Lua variables).

Text management using a piece table/chain
=========================================

The core of this editor is a persistent data structure called a piece
table which supports all modifications in `O(m)`, where `m` is the number
of non-consecutive editing operations. This bound could be further
improved to `O(log m)` by use of a balanced search tree, however the
additional complexity doesn't seem to be worth it, for now.

The actual data is stored in buffers which are strictly append only.
There exist two types of buffers, one fixed-sized holding the original
file content and multiple append-only ones storing the modifications.

A text, i.e. a sequence of bytes, is represented as a doubly linked
list of pieces each with a pointer into a buffer and an associated
length. Pieces are never deleted but instead always kept around for
redo/undo support. A span is a range of pieces, consisting of a start
and end piece. Changes to the text are always performed by swapping
out an existing, possibly empty, span with a new one.

An empty document is represented by two special sentinel pieces which
always exist.

Loading a file from disk is as simple as mmap(2)-ing it into a buffer,
creating a corresponding piece and adding it to the doubly linked list.
Hence loading a file is a constant time operation i.e. independent of
the actual file size (assuming the operating system uses demand paging).

    /-+ --> +-----------------+ --> +-\
    | |     | I am an editor! |     | |
    \-+ <-- +-----------------+ <-- +-/

Inserting a chunk of data amounts to appending the new content to a
modification buffer, followed by the creation of new pieces. An insertion
in the middle of an existing piece requires the creation of 3 new pieces.
Two of them hold references to the text before and after the insertion
point respectively, while the third one points to the newly added text.

    /-+ --> +---------------+ --> +----------------+ --> +--+ --> +-\
    | |     | I am an editor|     |which sucks less|     |! |     | |
    \-+ <-- +---------------+ <-- +----------------+ <-- +--+ <-- +-/

    modification buffer content: "which sucks less"

During this insertion operation the old span [3,3] has been replaced
by the new span [4,6]. Notice that the pieces in the old span were not
changed, therefore still point to their predecessors/successors, and can
thus be swapped back in.

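The insertion and the subsequent swap can be sketched in a few lines of C.
This is a simplified illustration and not the code from `text.[ch]`: all
type and function names (`Piece`, `Span`, `Text`, `span_swap`,
`text_insert`, `text_copy`) are assumptions, only insertions strictly
inside a piece are handled, and pieces are never freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct Piece Piece;
struct Piece {
    Piece *prev, *next;  /* neighbours in the piece chain */
    const char *data;    /* points into an append only buffer */
    size_t len;
};
typedef struct { Piece *start, *end; } Span;  /* inclusive range of pieces */
typedef struct { Piece begin, end; } Text;    /* the two sentinel pieces */

static Piece *piece_new(const char *data, size_t len) {
    Piece *p = calloc(1, sizeof *p);
    assert(p != NULL);
    p->data = data;
    p->len = len;
    return p;
}

/* Set up a text holding one piece. The Text must not be moved afterwards. */
static void text_init(Text *t, const char *orig) {
    Piece *p = piece_new(orig, strlen(orig));
    t->begin = (Piece){ .next = p };
    t->end = (Piece){ .prev = p };
    p->prev = &t->begin;
    p->next = &t->end;
}

/* Swap span `out` for span `in`. Pieces inside `out` are left untouched,
 * so calling span_swap(in, out) later restores the previous state. */
static void span_swap(Span out, Span in) {
    assert(in.start->prev == out.start->prev && in.end->next == out.end->next);
    out.start->prev->next = in.start;
    out.end->next->prev = in.end;
}

/* Insert `len` bytes at `pos`, which must lie strictly inside a piece.
 * `data` is assumed to already live in a modification buffer. */
static int text_insert(Text *t, size_t pos, const char *data, size_t len,
                       Span *old_span, Span *new_span) {
    size_t cur = 0;
    for (Piece *p = t->begin.next; p != &t->end; cur += p->len, p = p->next) {
        size_t off = pos - cur;
        if (pos <= cur || off >= p->len)
            continue;
        Piece *before = piece_new(p->data, off);
        Piece *mid = piece_new(data, len);
        Piece *after = piece_new(p->data + off, p->len - off);
        before->prev = p->prev; before->next = mid;
        mid->prev = before;     mid->next = after;
        after->prev = mid;      after->next = p->next;
        *old_span = (Span){ p, p };
        *new_span = (Span){ before, after };
        span_swap(*old_span, *new_span);
        return 0;
    }
    return -1;
}

/* Copy the whole text into `dst` as a NUL terminated string. */
static size_t text_copy(const Text *t, char *dst, size_t cap) {
    size_t n = 0;
    assert(cap > 0);
    for (const Piece *p = t->begin.next; p != &t->end; p = p->next) {
        assert(n + p->len < cap);
        memcpy(dst + n, p->data, p->len);
        n += p->len;
    }
    dst[n] = '\0';
    return n;
}
```

Inserting `" which sucks less"` at offset 14 of `"I am an editor!"` yields
three pieces. Calling `span_swap(new_span, old_span)` afterwards acts as an
undo, and `span_swap(old_span, new_span)` as a redo.
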
If the insertion point happens to be at a piece boundary, the old span
is empty, and the new span only consists of the newly allocated piece.

Similarly a delete operation splits the pieces at appropriate places.

    /-+ --> +-----+ --> +--+ --> +-\
    \-+ <-- +-----+ <-- +--+ <-- +-/

Where the old span [4,5] got replaced by the new span [7,7]. The underlying
buffers remain unchanged.

Notice that the common case of appending text to a given piece is fast,
since the new data is simply appended to the buffer and the piece length
is increased accordingly. In order to keep the number of pieces down,
the most recently edited piece is cached and changes to it are done
in place (this is the only time buffers are modified in a non-append
only way). As a consequence they cannot be undone.

Since the buffers are append only and the spans/pieces are never destroyed,
undo/redo functionality is implemented by swapping the required spans/pieces
back in.

As illustrated above, each change to the text is recorded by an old and
a new span. An action consists of multiple changes which logically belong
to each other and should thus also be reverted together. For example
a search and replace operation is one action with possibly many changes
all over the text.

The text states can be marked by means of a snapshotting operation.
Snapshotting saves a new node to the history graph and creates a fresh
Action to which future changes will be appended until the next snapshot.

Actions make up the nodes of a connected digraph, each representing a state
of the file at some time during the current editing session. The edges of the
digraph represent state transitions that are supported by the editor. The edges
are implemented as four Action pointers (`prev`, `next`, `earlier`, and `later`).

The editor operations that execute the four aforementioned transitions
are `undo`, `redo`, `earlier`, and `later`, respectively. Undo and
redo behave in the traditional manner, changing the state one Action
at a time. Earlier and later, however, traverse the states in chronological
order, which may occasionally involve undoing and redoing many Actions at once.

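The difference between the two pairs of transitions is easiest to see in a
small model. The C sketch below only tracks which state is current; it does
not swap any spans. The `History` type, the integer `id` label and the
choice that `next` points to the most recently created child are
assumptions for illustration, not details taken from the vis sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Action Action;
struct Action {
    Action *prev;     /* parent state, reached by undo */
    Action *next;     /* most recently created child, reached by redo */
    Action *earlier;  /* state created immediately before this one */
    Action *later;    /* state created immediately after this one */
    int id;           /* hypothetical label standing in for the changes */
};

typedef struct {
    Action *current;  /* state the text is currently in */
    Action *latest;   /* most recently created state */
    int count;
} History;

/* Create a new state as a child of the current one and move to it. */
static Action *history_snapshot(History *h) {
    Action *a = calloc(1, sizeof *a);
    assert(a != NULL);
    a->id = h->count++;
    a->prev = h->current;
    if (h->current)
        h->current->next = a;
    a->earlier = h->latest;
    if (h->latest)
        h->latest->later = a;
    h->current = h->latest = a;
    return a;
}

/* Follow one edge; returns the id of the new state or -1 if there is none. */
static int history_move(History *h, Action *to) {
    if (!to)
        return -1;
    h->current = to;
    return to->id;
}

static int history_undo(History *h)    { return history_move(h, h->current->prev); }
static int history_redo(History *h)    { return history_move(h, h->current->next); }
static int history_earlier(History *h) { return history_move(h, h->current->earlier); }
static int history_later(History *h)   { return history_move(h, h->current->later); }
```

After creating states 0, 1 and 2, undoing once and then creating state 3,
state 2 is no longer reachable through undo and redo, but `earlier` and
`later` still visit it because they follow creation order.
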
Because we are working with a persistent data structure marks can be
represented as pointers into the underlying (append only) buffers.
To get the position of an existing mark it suffices to traverse the
list of pieces and perform a range query on the associated buffer
segments. This also nicely integrates with the undo/redo mechanism.
If a span is swapped out all contained marks (pointers) become invalid
because they are no longer reachable from the piece chain. Once an
action is undone, and the corresponding span swapped back in, the
marks become visible again. No explicit mark management is necessary.

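A mark lookup along these lines might look as follows in C. The names
`Mark`, `mark_get` and `EPOS` as well as the singly linked `Piece` are
simplifications chosen for this sketch, not the editor's actual
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Piece Piece;
struct Piece {
    Piece *next;
    const char *data;  /* points into an append only buffer */
    size_t len;
};

typedef const char *Mark;    /* a mark is an address inside a buffer */
#define EPOS ((size_t)-1)    /* returned if the mark is not reachable */

/* Translate a mark into a byte offset from the start of the text by
 * checking, piece by piece, whether it falls into the referenced segment. */
static size_t mark_get(const Piece *head, Mark mark) {
    assert(mark != NULL);
    uintptr_t m = (uintptr_t)mark;
    size_t cur = 0;
    for (const Piece *p = head; p; p = p->next) {
        uintptr_t start = (uintptr_t)p->data;
        if (m >= start && m - start < p->len)
            return cur + (size_t)(m - start);
        cur += p->len;
    }
    return EPOS;
}
```

A mark pointing into a modification buffer resolves to `EPOS` while the
piece referencing that buffer is swapped out, and to a valid offset again
once it is swapped back in, without any bookkeeping on the mark itself.
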
The main advantage of the piece chain as described above is that all
operations are performed independent of the file size but instead linear
in the number of pieces i.e. editing operations. The original file buffer
never changes which means the `mmap(2)` can be performed read only which
makes optimal use of the operating system's virtual memory / paging system.

The maximum editable file size is limited by the amount of memory a process
is allowed to map into its virtual address space; this shouldn't be a problem
in practice. The whole process assumes that the file can be used as is.
In particular the editor assumes all input and the file itself is encoded
as UTF-8. Supporting other encodings would require conversion using `iconv(3)`
or similar upon loading and saving the document.

Similarly the editor has to cope with the fact that lines can be terminated
either by `\n` or `\r\n`. There is no conversion to a line based structure in
place. Instead the whole text is exposed as a sequence of bytes. All
addressing happens by means of zero based byte offsets from the start of
the file.

The main disadvantage of the piece chain data structure is that the text
is not stored contiguously in memory which makes seeking around somewhat
harder. This also implies that standard library calls like the `regex(3)`
functions cannot be used as is. However this is the case for all but
the most simple data structures used in text editors.

Syntax Highlighting using Parsing Expression Grammars
=====================================================

[Parsing Expression Grammars](https://en.wikipedia.org/wiki/Parsing_expression_grammar)
(PEG) have the nice property that they are closed under composition.
In the context of an editor this is useful because lexers can be
embedded into each other, thus simplifying syntax highlighting
definitions.

Vis reuses the [Lua](http://www.lua.org/) [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg/)
based lexers from the [Scintillua](http://foicica.com/scintillua/) project.

This section contains some ideas for further architectural changes.

Event loop with asynchronous I/O
--------------------------------

The editor core should feature a proper main loop mechanism supporting
asynchronous non-blocking and always cancelable tasks which could be
used for all possibly long lived actions such as:

 - `:substitute` and `:write` commands
 - compiler integration (similar to vim's quick fix functionality)

Client/Server Architecture / RPC interface
------------------------------------------

In principle it would be nice to follow a similar client/server approach
as [sam/samterm](http://sam.cat-v.org/) i.e. having the main editor as a
server and each window as a separate client process with communication
over a unix domain socket.

That way window management would be taken care of by dwm or dvtm and the
different client processes would still share common cut/paste registers.

This would also enable a language agnostic plugin system.

Efficient Search and Replace
----------------------------

Currently the editor copies the whole text to a contiguous memory block
and then uses the standard regex functions from libc. Clearly this is not
a satisfactory solution for large files.

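The copy-then-search approach described above can be sketched with the
POSIX `regcomp(3)`/`regexec(3)` functions. The function `text_search` and
the simplified `Piece` type are hypothetical, and the sketch inherits the
limitation that a NUL byte in the text ends the search early.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct Piece Piece;
struct Piece {
    Piece *next;
    const char *data;
    size_t len;
};

/* Return the byte offset of the first match of the extended regular
 * expression `pattern`, or (size_t)-1 if there is none or on error. */
static size_t text_search(const Piece *head, const char *pattern) {
    assert(pattern != NULL);
    size_t len = 0, off = 0, pos = (size_t)-1;
    for (const Piece *p = head; p; p = p->next)
        len += p->len;
    /* Temporarily doubles memory consumption: the whole text is copied. */
    char *buf = malloc(len + 1);
    if (!buf)
        return pos;
    for (const Piece *p = head; p; p = p->next) {
        memcpy(buf + off, p->data, p->len);
        off += p->len;
    }
    buf[len] = '\0';
    regex_t re;
    regmatch_t match;
    if (regcomp(&re, pattern, REG_EXTENDED) == 0) {
        if (regexec(&re, buf, 1, &match, 0) == 0)
            pos = (size_t)match.rm_so;
        regfree(&re);
    }
    free(buf);
    return pos;
}
```

Because the text is flattened first, matches spanning piece boundaries are
found without special handling. The price is the extra copy, which is what
makes this unsuitable for large files.
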
The long term solution is to write our own regular expression engine or
modify an existing one to make use of the iterator API. This would allow
efficient search without having to double memory consumption.

The used regex engine should use a non-backtracking algorithm. Useful
resources include:

 - [Russ Cox's regex page](http://swtch.com/~rsc/regexp/)
 - [TRE](https://github.com/laurikari/tre) as
   [used by musl](http://git.musl-libc.org/cgit/musl/tree/src/regex)
   which uses a parallel [TNFA matcher](http://laurikari.net/ville/spire2000-tnfa.ps)
 - [Plan9's regex library](http://plan9.bell-labs.com/sources/plan9/sys/src/libregexp/)
   which has its root in Rob Pike's sam text editor
 - [RE2](https://github.com/google/re2) C++ regex library

A quick overview of the code structure to get you started:

 File(s)             | Description
 ------------------- | -----------------------------------------------------
 `text.[ch]`         | low level text / marks / {un,re}do / piece table implementation
 `text-motions.[ch]` | movement functions take a file position and return a new one
 `text-objects.[ch]` | functions take a file position and return a file range
 `text-regex.[ch]`   | text search functionality, designated place for regex engine
 `text-util.[ch]`    | text related utility functions mostly dealing with file ranges
 `view.[ch]`         | ui-independent viewport, shows part of a file, syntax highlighting, cursor placement, selection handling
 `ui.h`              | abstract interface which has to be implemented by ui backends
 `ui-curses.[ch]`    | a terminal / curses based user interface implementation
 `buffer.[ch]`       | dynamically growing buffer used for registers and macros
 `ring-buffer.[ch]`  | fixed size ring buffer used for the jump list
 `map.[ch]`          | crit-bit tree based map supporting unique prefix lookups and ordered iteration. used to implement `:`-commands
 `vis.h`             | vi(m) specific editor frontend library public API
 `vis.c`             | vi(m) specific editor frontend implementation
 `vis-core.h`        | internal header file, various structs for core editor primitives
 `vis-cmds.c`        | vi(m) `:`-command implementation
 `vis-modes.c`       | vi(m) mode switching, enter/leave event handling
 `vis-motions.c`     | vi(m) cursor motion implementation
 `vis-operators.c`   | vi(m) operator implementation
 `vis-lua.c`         | Lua bindings, exposing core vis APIs for in process extension
 `main.c`            | key action definitions, program entry point
 `config.def.h`      | definition of default key bindings (mapping of key actions)
 `visrc.lua`         | Lua startup and configuration script
 `lexers/`           | Lua LPeg based lexers used for syntax highlighting

Testing infrastructure for the
[low level text manipulation routines](https://github.com/martanne/vis/tree/test/test/text),
[vim compatibility](https://github.com/martanne/vis/tree/test/test/vim) and
[vis specific features](https://github.com/martanne/vis/tree/test/test/vis)
is in place, but lacks proper test cases.