Vis a vim-like text editor
==========================

Vis aims to be a modern, legacy free, simple yet efficient vim-like editor.

As a universal editor it has decent Unicode support (including double width
and combining characters) and should cope with arbitrary files including:

 - large ones e.g. >500M SQL dumps or CSV exports
 - single line ones e.g. minified JavaScript
 - binary ones e.g. ELF files

Efficient syntax highlighting is provided using Parsing Expression Grammars
which can be conveniently expressed using Lua in form of LPeg.

The editor core is written in a reasonable amount of clean (your mileage
may vary), modern and legacy free C code enabling it to run in resource
constrained environments. The implementation should be easy to hack on
and encourage experimentation (e.g. native built in support for multiple
cursors). There also exists a Lua API for in process extensions.

Vis strives to be *simple* and focuses on its core task: efficient text
management. As an example the file open dialog is provided by an independent
utility. There exist plans to use a client/server architecture, delegating
window management to your windowing system or favorite terminal multiplexer.

The intention is *not* to be bug for bug compatible with vim; instead a
similar editing experience should be provided. The goal could thus be
summarized as "80% of vim's features implemented in roughly 1% of the code".

![vis demo](https://raw.githubusercontent.com/martanne/vis/gh-pages/screencast.gif)

Getting started / Build instructions
====================================

In order to build vis you will need a C99 compiler as well as:

 * a C library, we recommend [musl](http://www.musl-libc.org/)
 * [libcurses](http://www.gnu.org/software/ncurses/), preferably in the
   wide-character version
 * [libtermkey](http://www.leonerd.org.uk/code/libtermkey/)
 * [lua](http://www.lua.org/) >= 5.2
 * [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg/) >= 0.12 (runtime
   dependency required for syntax highlighting)

If you want a self contained statically linked binary you can try
to run `make standalone` which will attempt to download, compile
and install all of the above dependencies. `make local` will do
the same but only for libtermkey, lua and LPeg (i.e. the system
C and curses libraries are used).

To build a regular dynamically linked binary using the system
libraries, simply run `make` (possibly after adapting `config.mk`
to match your system).

The following section gives a quick overview over the currently

    =           (format using fmt(1))

Operators can be forced to work line wise by specifying `V`.

    gj          (display line down)
    ^           (first non-blank of line)
    g_          (last non-blank of line)
    b           (previous start of a word)
    B           (previous start of a WORD)
    w           (next start of a word)
    W           (next start of a WORD)
    e           (next end of a word)
    E           (next end of a WORD)
    ge          (previous end of a word)
    gE          (previous end of a WORD)
    {           (previous paragraph)
    (           (previous sentence)
    [[          (previous start of C-like function)
    []          (previous end of C-like function)
    ][          (next start of C-like function)
    ]]          (next end of C-like function)
    g0          (begin of display line)
    gm          (middle of display line)
    g$          (end of display line)
    G           (goto line or end of file)
    n           (repeat last search forward)
    N           (repeat last search backwards)
    *           (search word under cursor forwards)
    #           (search word under cursor backwards)
    f{char}     (to next occurrence of char to the right)
    t{char}     (till before next occurrence of char to the right)
    F{char}     (to next occurrence of char to the left)
    T{char}     (till before next occurrence of char to the left)
    ;           (repeat last to/till movement)
    ,           (repeat last to/till movement but in opposite direction)
    /{text}     (to next match of text in forward direction)
    ?{text}     (to next match of text in backward direction)

An empty line is currently neither a word nor a WORD.

The semantics of a paragraph and a sentence is also not always 100%

Some of these commands do not work as in vim when prefixed with a
digit i.e. a multiplier. As an example in vim `3$` moves to the end
of the 3rd line down. However vis treats it as a move to the end of
current line which is repeated 3 times where the last two have no

All of the following text objects are implemented in an inner variant
(prefixed with `i`) and a normal variant (prefixed with `a`):

    [,], (,), {,}, <,>, ", ', `    block enclosed by these symbols

For sentence and paragraph there is no difference between the
inner and normal variants.

Additionally the following text objects, which are not part of stock vim

    ae    entire file content
    ie    entire file content except for leading and trailing empty lines
    af    C-like function definition including immediately preceding comments
    if    C-like function definition only function body
    il    current line without leading and trailing white spaces

At the moment there exists a more or less functional insert, replace
and visual mode (in both line and character wise variants).

Visual block mode is not implemented and there exists no immediate
plan to do so. Instead vis has built in support for multiple cursors.

### Multiple Cursors / Selections

vis supports multiple cursors with immediate visual feedback (unlike
in the visual block mode of vim where for example inserts only become
visible upon exit). There always exists one primary cursor, additional
ones can be created as needed.

To manipulate multiple cursors use in normal mode:

    CTRL-K    create a new cursor on the line above
    CTRL-J    create a new cursor on the line below
    CTRL-P    remove least recently added cursor
    CTRL-N    select word the cursor is currently over, switch to visual mode
    CTRL-A    try to align all cursors on the same column
    ESC       if a selection is active, clear it.
              Otherwise dispose all but the primary cursor.

Visual mode was enhanced to recognize:

    I         create a cursor at the start of every selected line
    A         create a cursor at the end of every selected line
    CTRL-N    create new cursor and select next word matching current selection
    CTRL-X    clear (skip) current selection, but select next matching word
    CTRL-P    remove least recently added cursor

    [a-z]     general purpose marks
    <         start of the last selected visual area in current buffer
    >         end of the last selected visual area in current buffer

No marks across files are supported. Marks are not preserved over

Only the 26 lower case registers `[a-z]` and 1 additional default register

### Undo/Redo and Repeat

The text is currently snapshotted whenever an operator is completed as
well as when insert or replace mode is left. Additionally a snapshot
is also taken if in insert or replace mode a certain idle time elapses.

Another idea is to snapshot based on the distance between two consecutive
editing operations (as they are likely unrelated and thus should be
individually reversible).

Besides the regular undo functionality, the key bindings `g+` and `g-`
traverse the history in chronological order. Furthermore the `:earlier`
and `:later` commands provide means to restore the text to an arbitrary

The repeat command `.` works for all operators and is able to repeat
the last insertion or replacement.

`[a-z]` are recognized macro names, `q` starts a recording, `@` plays it back.
`@@` refers to the most recently recorded macro.

### Command line prompt

At the `:`-command prompt only the following commands are recognized, any
valid unique prefix can be used:

    :bdelete      close all windows which display the same file as the current one
    :edit         replace current file with a new one or reload it from disk
    :open         open a new window
    :qall         close all windows, exit editor
    :quit         close currently focused window
    :read         insert content of another file at current cursor position
    :split        split window horizontally
    :vsplit       split window vertically
    :new          open an empty window, arrange horizontally
    :vnew         open an empty window, arrange vertically
    :wq           write changes then close window
    :xit          like :wq but write only when changes have been made
    :write        write current buffer content to file
    :saveas       save file under another name
    :substitute   search and replace currently implemented in terms of `sed(1)`
    :!            filter range through external command
    :earlier      revert to older text state
    :later        revert to newer text state
    :set          set the options below

      set display width of a tab and number of spaces to use if

      whether typed in tabs should be expanded to tabwidth spaces

      replicate spaces and tabs at the beginning of the line when

    relativenumber (yes|no)

      whether absolute or relative line numbers are printed alongside

      use syntax definition given (e.g. "c") or disable syntax
      highlighting if no such definition exists (e.g. :set syntax off)

    show newlines=[1|0] tabs=[1|0] spaces=[0|1]

      show/hide special white space replacement symbols

      highlight the line on which the cursor currently resides

      highlight the given column

      use the given theme / color scheme for syntax highlighting

Each command can be prefixed with a range made up of a start and
an end position as in start,end. Valid position specifiers are:

    .             start of the current line
    +n and -n     start of the line relative to the current line
    'm            position of mark m
    /pattern/     first match after current position

If only a start position without a command is given then the cursor
is moved to that position. Additionally the following ranges are

    %             the whole file, equivalent to 1,$
    *             the current selection, equivalent to '<,'>

History support, tab completion and wildcard expansion are other
worthwhile features. However implementing them inside the editor

### Tab <-> Space conversion and Line endings \n vs \r\n

Tabs can optionally be expanded to a configurable number of spaces.
The first line ending in the file determines what will be inserted
upon a line break (defaults to \n).
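
As a rough illustration of the rule above, the following Python sketch picks
the line break to insert by looking at the first line ending of a file. It is
not taken from the vis sources (which are written in C); the function name
`detect_newline` is made up for this example.

```python
def detect_newline(data: bytes) -> bytes:
    """Return the line ending used by the first line of `data`."""
    pos = data.find(b"\n")
    # A "\n" directly preceded by "\r" means the file uses "\r\n".
    if pos > 0 and data[pos - 1:pos] == b"\r":
        return b"\r\n"
    # Default: plain "\n", also used when the file has no line break yet.
    return b"\n"
```

Only the first line ending is inspected, so a file with mixed line endings
is treated according to whatever its first line uses.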

### Jump list and change list

A per window, file local jump list (navigate with `CTRL+O` and `CTRL+I`)
and change list (navigate with `g;` and `g,`) is supported. The jump
list is implemented as a fixed sized ring buffer.
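
To make the consequence of a fixed sized ring buffer concrete, here is a
minimal Python sketch: once the buffer is full, every new jump overwrites the
oldest one. The class and method names are hypothetical and do not mirror
`ring-buffer.[ch]`.

```python
class RingBuffer:
    """Fixed size buffer which overwrites its oldest entry when full."""

    def __init__(self, size):
        self.items = [None] * size
        self.start = 0   # index of the oldest entry
        self.count = 0   # number of valid entries

    def add(self, item):
        end = (self.start + self.count) % len(self.items)
        self.items[end] = item
        if self.count < len(self.items):
            self.count += 1
        else:
            # Buffer was full: the oldest entry has just been overwritten.
            self.start = (self.start + 1) % len(self.items)

    def entries(self):
        """Return the stored entries from oldest to newest."""
        size = len(self.items)
        return [self.items[(self.start + i) % size] for i in range(self.count)]
```

With a size of 3, adding the positions 10, 20, 30 and 40 leaves 20, 30 and 40
in the list; the jump to position 10 is no longer reachable.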

The mouse is currently not used at all.

Some of the features of vim which will *not* be implemented:

 - tabs / multiple workspaces / advanced window management
 - file and directory browser
 - support for file archives (tar, zip, ...)
 - support for network protocols (ftp, http, ssh ...)
 - GUIs (neither x11, motif, gtk, win32 ...) although the codebase
   should make it easy to add them
 - plugins (certainly not vimscript, if anything it should be lua based)
 - ex mode (if you need a stream editor use `ssam(1)`
 - internal spell checker
 - compile time configurable features / `#ifdef` mess

Text management using a piece table/chain
=========================================

The core of this editor is a persistent data structure called a piece
table which supports all modifications in `O(m)`, where `m` is the number
of non-consecutive editing operations. This bound could be further
improved to `O(log m)` by use of a balanced search tree, however the
additional complexity doesn't seem to be worth it, for now.

The actual data is stored in buffers which are strictly append only.
There exist two types of buffers, one fixed-sized holding the original
file content and multiple append-only ones storing the modifications.

A text, i.e. a sequence of bytes, is represented as a doubly linked
list of pieces each with a pointer into a buffer and an associated
length. Pieces are never deleted but instead always kept around for
redo/undo support. A span is a range of pieces, consisting of a start
and end piece. Changes to the text are always performed by swapping
out an existing, possibly empty, span with a new one.

An empty document is represented by two special sentinel pieces which

Loading a file from disk is as simple as mmap(2)-ing it into a buffer,
creating a corresponding piece and adding it to the doubly linked list.
Hence loading a file is a constant time operation i.e. independent of
the actual file size (assuming the operating system uses demand paging).

    /-+ --> +-----------------+ --> +-\
    | |     | I am an editor! |     | |
    \-+ <-- +-----------------+ <-- +-/
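
The loading step can be sketched in a few lines of Python using the standard
`mmap` module. This is only an illustration of the idea, not the C code from
`text.c`: the helper names `load_file` and `read_text` are invented here, and
a plain list of `(offset, length)` tuples stands in for the linked list of
pieces.

```python
import mmap
import os

def load_file(path):
    """Map `path` read only and describe it with a single piece."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)
        if size == 0:
            return b"", []          # an empty file cannot be mapped
        # Length 0 maps the whole file. No data is copied here, the
        # operating system pages it in when it is first accessed.
        buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return buf, [(0, size)]         # one piece covering the whole buffer

def read_text(buf, pieces):
    """Assemble the text by following the pieces into the buffer."""
    return b"".join(buf[off:off + n] for off, n in pieces)
```

The work done by `load_file` does not depend on the file size: regardless of
how large the file is, the result is one mapping and one piece.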

Inserting a chunk of data amounts to appending the new content to a
modification buffer, followed by the creation of new pieces. An insertion
in the middle of an existing piece requires the creation of 3 new pieces.
Two of them hold references to the text before and after the insertion
point, respectively, while the third one points to the newly added text.

    /-+ --> +---------------+ --> +----------------+ --> +--+ --> +-\
    | |     | I am an editor|     |which sucks less|     |! |     | |
    \-+ <-- +---------------+ <-- +----------------+ <-- +--+ <-- +-/

    modification buffer content: "which sucks less"

During this insertion operation the old span [3,3] has been replaced
by the new span [4,6]. Notice that the pieces in the old span were not
changed, therefore still point to their predecessors/successors, and can
thus be swapped back in.

If the insertion point happens to be at a piece boundary, the old span
is empty, and the new span only consists of the newly allocated piece.
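
The two insertion cases described above can be reproduced with a small Python
model. It is a simplified sketch, not the actual implementation: pieces are
kept in a plain Python list of `(buffer, offset, length)` tuples instead of a
doubly linked list, there is a single modification buffer, and the class
`PieceTable` with its methods is hypothetical.

```python
ORIGINAL, ADDED = 0, 1   # indices of the two buffers

class PieceTable:
    def __init__(self, original: bytes):
        # buffers[ORIGINAL] is never modified, buffers[ADDED] only grows.
        self.buffers = [original, bytearray()]
        self.pieces = [(ORIGINAL, 0, len(original))] if original else []

    def text(self) -> bytes:
        return b"".join(bytes(self.buffers[b][o:o + n])
                        for b, o, n in self.pieces)

    def insert(self, pos: int, data: bytes) -> None:
        total = sum(n for _, _, n in self.pieces)
        if not 0 <= pos <= total:
            raise IndexError("insert position out of range")
        # Append the new content to the modification buffer.
        start = len(self.buffers[ADDED])
        self.buffers[ADDED] += data
        new = (ADDED, start, len(data))
        offset = 0
        for i, (buf, off, n) in enumerate(self.pieces):
            if pos < offset + n:
                split = pos - offset
                if split == 0:
                    # Piece boundary: only the new piece is added.
                    self.pieces.insert(i, new)
                else:
                    # Middle of a piece: replace it by three new pieces.
                    before = (buf, off, split)
                    after = (buf, off + split, n - split)
                    self.pieces[i:i + 1] = [before, new, after]
                return
            offset += n
        self.pieces.append(new)   # insertion at the very end of the text
```

Inserting `b" which sucks less"` at offset 14 of `b"I am an editor!"` yields
three pieces, two of which still point into the untouched original buffer.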

Similarly a delete operation splits the pieces at appropriate places.

    /-+ --> +-----+ --> +--+ --> +-\
    \-+ <-- +-----+ <-- +--+ <-- +-/

Where the old span [4,5] got replaced by the new span [7,7]. The underlying
buffers remain unchanged.
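
A deletion can be sketched in the same simplified model, with pieces as
`(buffer, offset, length)` tuples in a Python list. The function `delete` and
the example data are assumptions made for illustration. It returns a new piece
list and leaves the old one untouched, which mirrors the fact that old pieces
are kept around so they can be swapped back in.

```python
def delete(pieces, pos, length):
    """Return a new piece list with `length` bytes removed at `pos`."""
    if length <= 0:
        return list(pieces)
    end = pos + length
    result = []
    offset = 0                     # text offset of the current piece
    for buf, off, n in pieces:
        piece_end = offset + n
        if piece_end <= pos or offset >= end:
            result.append((buf, off, n))   # outside the deleted range
        else:
            if offset < pos:       # keep the part before the range
                result.append((buf, off, pos - offset))
            if piece_end > end:    # keep the part after the range
                result.append((buf, off + (end - offset), piece_end - end))
        offset = piece_end
    return result
```

No byte in any buffer is touched: only offsets and lengths of the affected
pieces change, and pieces which lie completely inside the deleted range are
dropped from the new list.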

Notice that the common case of appending text to a given piece is fast
since the new data is simply appended to the buffer and the piece length
is increased accordingly. In order to keep the number of pieces down,
the most recently edited piece is cached and changes to it are done
in place (this is the only time buffers are modified in a non-append
only way). As a consequence they can not be undone.

Since the buffers are append only and the spans/pieces are never destroyed,
undo/redo functionality is implemented by swapping the required spans/pieces

As illustrated above, each change to the text is recorded by an old and
a new span. An action consists of multiple changes which logically belong
to each other and should thus also be reverted together. For example
a search and replace operation is one action with possibly many changes

The text states can be marked by means of a snapshotting operation.
Snapshotting saves a new node to the history graph and creates a fresh
Action to which future changes will be appended until the next snapshot.

Actions make up the nodes of a connected digraph, each representing a state
of the file at some time during the current editing session. The edges of the
digraph represent state transitions that are supported by the editor. The edges
are implemented as four Action pointers (`prev`, `next`, `earlier`, and `later`).

The editor operations that execute the four aforementioned transitions
are `undo`, `redo`, `earlier`, and `later`, respectively. Undo and
redo behave in the traditional manner, changing the state one Action
at a time. Earlier and later, however, traverse the states in chronological
order, which may occasionally involve undoing and redoing many Actions at once.
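
The difference between the two pairs of transitions shows up as soon as the
history branches. The Python sketch below models the four pointers under two
loudly stated assumptions: each node stores the full text instead of a list of
span changes, and `next` always refers to the most recently created child. The
class names are chosen to match the prose, not the C structs.

```python
class Action:
    def __init__(self, text):
        self.text = text                    # stand-in for the recorded changes
        self.prev = self.next = None        # undo / redo targets
        self.earlier = self.later = None    # chronological neighbours

class History:
    def __init__(self, text=""):
        self.current = self.latest = Action(text)

    def snapshot(self, text):
        node = Action(text)
        node.prev, self.current.next = self.current, node
        node.earlier, self.latest.later = self.latest, node
        self.current = self.latest = node

    def _move(self, link):
        target = getattr(self.current, link)
        if target is not None:
            self.current = target
        return self.current.text

    def undo(self):
        return self._move("prev")

    def redo(self):
        return self._move("next")

    def earlier(self):
        return self._move("earlier")

    def later(self):
        return self._move("later")
```

After the sequence "a", "ab", "abc", undo, "abX" the state "abc" is no longer
reachable via undo and redo, but `earlier` and `later` still visit it because
they follow creation order rather than the parent/child relation.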

Because we are working with a persistent data structure marks can be
represented as pointers into the underlying (append only) buffers.
To get the position of an existing mark it suffices to traverse the
list of pieces and perform a range query on the associated buffer
segments. This also nicely integrates with the undo/redo mechanism.
If a span is swapped out all contained marks (pointers) become invalid
because they are no longer reachable from the piece chain. Once an
action is undone, and the corresponding span swapped back in, the
marks become visible again. No explicit mark management is necessary.
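
The lookup described above might look as follows in Python. This is a sketch
under simplifying assumptions: a mark is a `(buffer, offset)` pair rather than
a raw pointer, pieces are `(buffer, offset, length)` tuples in a list, and the
function name `mark_position` is invented for the example.

```python
def mark_position(pieces, mark):
    """Return the text offset of `mark`, or None if it is not reachable."""
    mark_buf, mark_off = mark
    pos = 0                                   # text offset of current piece
    for buf, off, n in pieces:
        # Range query: does the mark point into this piece's segment?
        if buf == mark_buf and off <= mark_off < off + n:
            return pos + (mark_off - off)
        pos += n
    return None                               # segment is swapped out
```

Because the buffers are append only, the mark itself never has to be updated:
an insertion before it merely shifts the computed position, and deleting the
surrounding text makes the lookup fail until the old pieces are swapped back
in.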

The main advantage of the piece chain as described above is that the cost
of all operations is independent of the file size and instead linear
in the number of pieces i.e. editing operations. The original file buffer
never changes, which means the `mmap(2)` can be performed read only, which
makes optimal use of the operating system's virtual memory / paging system.

The maximum editable file size is limited by the amount of memory a process
is allowed to map into its virtual address space; this shouldn't be a problem
in practice. The whole process assumes that the file can be used as is.
In particular the editor assumes all input and the file itself is encoded
as UTF-8. Supporting other encodings would require conversion using `iconv(3)`
or similar upon loading and saving the document.

Similarly the editor has to cope with the fact that lines can be terminated
either by `\n` or `\r\n`. There is no conversion to a line based structure in
place. Instead the whole text is exposed as a sequence of bytes. All
addressing happens by means of zero based byte offsets from the start of

The main disadvantage of the piece chain data structure is that the text
is not stored contiguously in memory which makes seeking around somewhat
harder. This also implies that standard library calls like the `regex(3)`
functions can not be used as is. However this is the case for all but
the most simple data structures used in text editors.

Syntax Highlighting using Parsing Expression Grammars
=====================================================

[Parsing Expression Grammars](https://en.wikipedia.org/wiki/Parsing_expression_grammar)
(PEG) have the nice property that they are closed under composition.
In the context of an editor this is useful because lexers can be
embedded into each other, thus simplifying syntax highlighting

Vis reuses the [Lua](http://www.lua.org/) [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg/)
based lexers from the [Scintillua](http://foicica.com/scintillua/) project.

This section contains some ideas for further architectural changes.

Event loop with asynchronous I/O
--------------------------------

The editor core should feature a proper main loop mechanism supporting
asynchronous non-blocking and always cancelable tasks which could be
used for all possibly long lived actions such as:

 - `:substitute` and `:write` commands
 - compiler integration (similar to vim's quick fix functionality)

Client/Server Architecture / RPC interface
------------------------------------------

In principle it would be nice to follow a similar client/server approach
as [sam/samterm](http://sam.cat-v.org/) i.e. having the main editor as a
server and each window as a separate client process with communication
over a unix domain socket.

That way window management would be taken care of by dwm or dvtm and the
different client processes would still share common cut/paste registers

This would also enable a language agnostic plugin system.

Efficient Search and Replace
----------------------------

Currently the editor copies the whole text to a contiguous memory block
and then uses the standard regex functions from libc. Clearly this is not
a satisfactory solution for large files.

The long term solution is to write our own regular expression engine or
modify an existing one to make use of the iterator API. This would allow
efficient search without having to double memory consumption.
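
To illustrate why an iterator based search avoids the copy, the Python sketch
below searches a sequence of byte chunks while keeping only a small overlap
between them. It is deliberately limited to literal substrings (a regex engine
would need to carry its matcher state across chunks instead), and both the
function `find` and the idea of feeding it one chunk per piece are assumptions
for the sake of the example.

```python
def find(chunks, needle):
    """Return the offset of `needle` in the concatenated chunks, or -1."""
    keep = max(len(needle) - 1, 0)   # bytes which may start a split match
    tail = b""                       # end of the data seen so far
    base = 0                         # text offset of the first byte of tail
    for chunk in chunks:
        window = tail + chunk
        idx = window.find(needle)
        if idx != -1:
            return base + idx
        # Carry over just enough bytes to detect a match which
        # straddles the boundary to the next chunk.
        tail = window[-keep:] if keep else b""
        base += len(window) - len(tail)
    return -1
```

Memory use is bounded by the largest chunk plus `len(needle) - 1` bytes,
independent of the total text size.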

The used regex engine should use a non-backtracking algorithm. Useful

 - [Russ Cox's regex page](http://swtch.com/~rsc/regexp/)
 - [TRE](https://github.com/laurikari/tre) as
   [used by musl](http://git.musl-libc.org/cgit/musl/tree/src/regex)
   which uses a parallel [TNFA matcher](http://laurikari.net/ville/spire2000-tnfa.ps)
 - [Plan9's regex library](http://plan9.bell-labs.com/sources/plan9/sys/src/libregexp/)
   which has its root in Rob Pike's sam text editor
 - [RE2](https://github.com/google/re2) C++ regex library

A quick overview of the code structure to get you started:

 File(s)             | Description
 ------------------- | -----------------------------------------------------
 `text.[ch]`         | low level text / marks / {un,re}do / piece table implementation
 `text-motions.[ch]` | movement functions take a file position and return a new one
 `text-objects.[ch]` | functions take a file position and return a file range
 `text-regex.[ch]`   | text search functionality, designated place for regex engine
 `text-util.[ch]`    | text related utility functions mostly dealing with file ranges
 `view.[ch]`         | ui-independent viewport, shows part of a file, syntax highlighting, cursor placement, selection handling
 `ui.h`              | abstract interface which has to be implemented by ui backends
 `ui-curses.[ch]`    | a terminal / curses based user interface implementation
 `buffer.[ch]`       | dynamically growing buffer used for registers and macros
 `ring-buffer.[ch]`  | fixed size ring buffer used for the jump list
 `map.[ch]`          | crit-bit tree based map supporting unique prefix lookups and ordered iteration. used to implement `:`-commands
 `vis.h`             | vi(m) specific editor frontend library public API
 `vis.c`             | vi(m) specific editor frontend implementation
 `vis-core.h`        | internal header file, various structs for core editor primitives
 `vis-cmds.c`        | vi(m) `:`-command implementation
 `vis-modes.c`       | vi(m) mode switching, enter/leave event handling
 `vis-motions.c`     | vi(m) cursor motion implementation
 `vis-operators.c`   | vi(m) operator implementation
 `vis-lua.c`         | Lua bindings, exposing core vis APIs for in process extension
 `main.c`            | key action definitions, program entry point
 `config.def.h`      | definition of default key bindings (mapping of key actions)
 `visrc.lua`         | Lua startup and configuration script
 `lexers/`           | Lua LPeg based lexers used for syntax highlighting

Testing infrastructure for the
[low level text manipulation routines](https://github.com/martanne/vis/tree/test/test/text),
[vim compatibility](https://github.com/martanne/vis/tree/test/test/vim) and
[vis specific features](https://github.com/martanne/vis/tree/test/test/vis)
is in place, but lacks proper test cases.