1 Vis a vim-like text editor
2 ==========================
4 [![Linux Build Status](https://travis-ci.org/martanne/vis.svg?branch=master)](https://travis-ci.org/martanne/vis)
5 [![Cygwin Build Status](https://ci.appveyor.com/api/projects/status/61059w4jpdnb77ft/branch/master?svg=true)](https://ci.appveyor.com/project/martanne/vis/branch/master)
6 [![Coverity Scan Build Status](https://scan.coverity.com/projects/3939/badge.svg)](https://scan.coverity.com/projects/3939)
7 [![codecov](https://codecov.io/gh/martanne/vis/branch/master/graph/badge.svg)](https://codecov.io/gh/martanne/vis)
8 [![#vis-editor on freenode](https://img.shields.io/badge/IRC-%23vis--editor-blue.svg)](irc://irc.freenode.net/vis-editor)
10 Vis aims to be a modern, legacy free, simple yet efficient editor
11 combining the strengths of both vi(m) and sam.
13 It extends vi's modal editing with built-in support for multiple
14 cursors/selections and combines it with [sam's](http://sam.cat-v.org/)
15 [structural regular expression](http://doc.cat-v.org/bell_labs/structural_regexps/)
16 based [command language](http://doc.cat-v.org/bell_labs/sam_lang_tutorial/).
18 As an universal editor it has decent Unicode support (including double width
19 and combining characters) and should cope with arbitrary files including:
21 - large (up to a few Gigabytes) ones including
22 - Wikipedia/OpenStreetMap XML / SQL / CSV dumps
23 - amalgamated source trees (e.g. SQLite)
24 - single line ones e.g. minified JavaScript
25 - binary ones e.g. ELF files
27 Efficient syntax highlighting is provided using
28 [Parsing Expression Grammars](https://en.wikipedia.org/wiki/Parsing_expression_grammar)
29 which can be conveniently expressed using [Lua](http://www.lua.org/)
30 in the form of [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg/).
32 The editor core is written in a reasonable amount of clean (your mileage
33 may vary), modern and legacy free C code enabling it to run in resource
34 constrained environments. The implementation should be easy to hack on
35 and encourage experimentation. There also exists a Lua API for in process
38 Vis strives to be *simple* and focuses on its core task: efficient text
39 management. As an example the file open dialog is provided by an independent
40 utility. There exist plans to use a client/server architecture, delegating
41 window management to your windowing system or favorite terminal multiplexer.
43 The intention is *not* to be bug for bug compatible with vi(m), instead
44 we aim to provide more powerful editing features based on an elegant design
45 and clean implementation.
47 [![vis demo](https://asciinema.org/a/41361.png)](https://asciinema.org/a/41361)
49 Getting started / Build instructions
50 ====================================
52 In order to build vis you will need a
53 [C99](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)
54 compiler, a [POSIX.1-2008](http://pubs.opengroup.org/onlinepubs/9699919799/)
55 compatible environment as well as:
57 * [libcurses](http://www.gnu.org/software/ncurses/), preferably in the
58 wide-character version
59 * [libtermkey](http://www.leonerd.org.uk/code/libtermkey/)
60 * [Lua](http://www.lua.org/) >= 5.2 (optional)
61 * [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg/) >= 0.12
62 (optional runtime dependency required for syntax highlighting)
64 Assuming these dependencies are met, execute:
66 $ ./configure && make && sudo make install
68 By default the `configure` script will try to auto detect support for
69 Lua. See `configure --help` for a list of supported options. You can
70 also manually tweak the generated `config.mk` file.
72 Or simply use one of the distribution provided packages:
74 * [ArchLinux](http://www.archlinux.org/packages/?q=vis)
75 * [Alpine Linux](https://pkgs.alpinelinux.org/packages?name=vis)
76 * [NixOS](https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/editors/vis/default.nix)
77 * [Source Mage GNU/Linux](http://download.sourcemage.org/grimoire/codex/test/editors/vis)
78 * [Void Linux](https://github.com/voidlinux/void-packages/tree/master/srcpkgs/vis)
79 * [pkgsrc](http://pkgsrc.se/wip/vis-editor)
84 End user documentation can be found in the [`vis(1)` manual page](http://martanne.github.io/vis/man/vis.1.html).
86 [Lua API Documentation](http://martanne.github.io/vis/doc/) is also available.
91 Some features which will *not* be implemented:
93 - tabs / multiple workspaces / advanced window management
94 - file and directory browser
95 - support for file archives (tar, zip, ...)
96 - support for network protocols (ftp, http, ssh ...)
99 - GUIs (neither x11, motif, gtk, win32 ...) although the codebase
100 should make it easy to add them
103 - ex mode, we have more elegant structural regexp
106 - internal spell checker
107 - lots of compile time configurable features / `#ifdef` mess
109 Text management using a piece table/chain
110 =========================================
112 The core of this editor is a persistent data structure called a piece
113 table which supports all modifications in `O(m)`, where `m` is the number
114 of non-consecutive editing operations. This bound could be further
115 improved to `O(log m)` by use of a balanced search tree, however the
116 additional complexity doesn't seem to be worth it, for now.
118 The actual data is stored in buffers which are strictly append only.
119 There exist two types of buffers, one fixed-sized holding the original
120 file content and multiple append-only ones storing the modifications.
122 A text, i.e. a sequence of bytes, is represented as a double linked
123 list of pieces each with a pointer into a buffer and an associated
124 length. Pieces are never deleted but instead always kept around for
125 redo/undo support. A span is a range of pieces, consisting of a start
126 and end piece. Changes to the text are always performed by swapping
127 out an existing, possibly empty, span with a new one.
129 An empty document is represented by two special sentinel pieces which
137 Loading a file from disk is as simple as mmap(2)-ing it into a buffer,
138 creating a corresponding piece and adding it to the double linked list.
139 Hence loading a file is a constant time operation i.e. independent of
140 the actual file size (assuming the operating system uses demand paging).
142 /-+ --> +-----------------+ --> +-\
143 | | | I am an editor! | | |
144 \-+ <-- +-----------------+ <-- +-/
150 Inserting a junk of data amounts to appending the new content to a
151 modification buffer. Followed by the creation of new pieces. An insertion
152 in the middle of an existing piece requires the creation of 3 new pieces.
153 Two of them hold references to the text before respectively after the
154 insertion point. While the third one points to the newly added text.
156 /-+ --> +---------------+ --> +----------------+ --> +--+ --> +-\
157 | | | I am an editor| |which sucks less| |! | | |
158 \-+ <-- +---------------+ <-- +----------------+ <-- +--+ <-- +-/
161 modification buffer content: "which sucks less"
163 During this insertion operation the old span [3,3] has been replaced
164 by the new span [4,6]. Notice that the pieces in the old span were not
165 changed, therefore still point to their predecessors/successors, and can
166 thus be swapped back in.
168 If the insertion point happens to be at a piece boundary, the old span
169 is empty, and the new span only consists of the newly allocated piece.
174 Similarly a delete operation splits the pieces at appropriate places.
176 /-+ --> +-----+ --> +--+ --> +-\
178 \-+ <-- +-----+ <-- +--+ <-- +-/
181 Where the old span [4,5] got replaced by the new span [7,7]. The underlying
182 buffers remain unchanged.
187 Notice that the common case of appending text to a given piece is fast
188 since, the new data is simply appended to the buffer and the piece length
189 is increased accordingly. In order to keep the number of pieces down,
190 the least recently edited piece is cached and changes to it are done
191 in place (this is the only time buffers are modified in a non-append
192 only way). As a consequence they can not be undone.
197 Since the buffers are append only and the spans/pieces are never destroyed
198 undo/redo functionality is implemented by swapping the required spans/pieces
201 As illustrated above, each change to the text is recorded by an old and
202 a new span. An action consists of multiple changes which logically belong
203 to each other and should thus also be reverted together. For example
204 a search and replace operation is one action with possibly many changes
207 The text states can be marked by means of a snapshotting operation.
208 Snapshotting saves a new node to the history graph and creates a fresh
209 Action to which future changes will be appended until the next snapshot.
211 Actions make up the nodes of a connected digraph, each representing a state
212 of the file at some time during the current editing session. The edges of the
213 digraph represent state transitions that are supported by the editor. The edges
214 are implemented as four Action pointers (`prev`, `next`, `earlier`, and `later`).
216 The editor operations that execute the four aforementioned transitions
217 are `undo`, `redo`,`earlier`, and `later`, respectively. Undo and
218 redo behave in the traditional manner, changing the state one Action
219 at a time. Earlier and later, however, traverse the states in chronological
220 order, which may occasionally involve undoing and redoing many Actions at once.
225 Because we are working with a persistent data structure marks can be
226 represented as pointers into the underlying (append only) buffers.
227 To get the position of an existing mark it suffices to traverse the
228 list of pieces and perform a range query on the associated buffer
229 segments. This also nicely integrates with the undo/redo mechanism.
230 If a span is swapped out all contained marks (pointers) become invalid
231 because they are no longer reachable from the piece chain. Once an
232 action is undone, and the corresponding span swapped back in, the
233 marks become visible again. No explicit mark management is necessary.
238 The main advantage of the piece chain as described above is that all
239 operations are performed independent of the file size but instead linear
240 in the number of pieces i.e. editing operations. The original file buffer
241 never changes which means the `mmap(2)` can be performed read only which
242 makes optimal use of the operating system's virtual memory / paging system.
244 The maximum editable file size is limited by the amount of memory a process
245 is allowed to map into its virtual address space, this shouldn't be a problem
246 in practice. The whole process assumes that the file can be used as is.
247 In particular the editor assumes all input and the file itself is encoded
248 as UTF-8. Supporting other encodings would require conversion using `iconv(3)`
249 or similar upon loading and saving the document.
251 Similarly the editor has to cope with the fact that lines can be terminated
252 either by `\n` or `\r\n`. There is no conversion to a line based structure in
253 place. Instead the whole text is exposed as a sequence of bytes. All
254 addressing happens by means of zero based byte offsets from the start of
257 The main disadvantage of the piece chain data structure is that the text
258 is not stored contiguous in memory which makes seeking around somewhat
259 harder. This also implies that standard library calls like the `regex(3)`
260 functions can not be used as is. However this is the case for all but
261 the most simple data structures used in text editors.
266 This section contains some ideas for further architectural changes.
268 Event loop with asynchronous I/O
269 --------------------------------
271 The editor core should feature a proper main loop mechanism supporting
272 asynchronous non-blocking and always cancelable tasks which could be
273 used for all possibly long lived actions. Ideally the editor core would
274 never block and always remain responsive.
276 Client/Server Architecture / RPC interface
277 ------------------------------------------
279 In principle it would be nice to follow a similar client/server approach
280 as [sam/samterm](http://sam.cat-v.org/) i.e. having the main editor as a
281 server and each window as a separate client process with communication
282 over a unix domain socket.
284 That way window management would be taken care of by dwm or dvtm and the
285 different client processes would still share common cut/paste registers
288 This would also enable a language agnostic plugin system.
290 Efficient Search and Replace
291 ----------------------------
293 Currently the editor copies the whole text to a contiguous memory block
294 and then uses the standard regex functions from libc. Clearly this is not
295 a satisfactory solution for large files.
297 The long term solution is to write our own regular expression engine or
298 modify an existing one to make use of the iterator API. This would allow
299 efficient search without having to double memory consumption.
301 The used regex engine should use a non-backtracking algorithm. Useful
304 - [Russ Cox's regex page](http://swtch.com/~rsc/regexp/)
305 - [TRE](https://github.com/laurikari/tre) as
306 [used by musl](http://git.musl-libc.org/cgit/musl/tree/src/regex)
307 which uses a parallel [TNFA matcher](http://laurikari.net/ville/spire2000-tnfa.ps)
308 - [Plan9's regex library](http://plan9.bell-labs.com/sources/plan9/sys/src/libregexp/)
309 which has its root in Rob Pike's sam text editor
310 - [RE2](https://github.com/google/re2) C++ regex library
315 Feel free to join `#vis-editor` on freenode to discuss development related issues.
317 A quick overview over the code structure to get you started:
319 File(s) | Description
320 ------------------- | -----------------------------------------------------
321 `array.[ch]` | dynamically growing array, can store arbitrarily sized objects
322 `buffer.[ch]` | dynamically growing buffer used for registers and macros
323 `config.def.h` | definition of default key bindings (mapping of key actions)
324 `lexers/` | Lua LPeg based lexers used for syntax highlighting
325 `main.c` | key action definitions, program entry point
326 `map.[ch]` | crit-bit tree based map supporting unique prefix lookups and ordered iteration, used to implement `:`-commands and run time key bindings
327 `register.[ch]` | register implementation, system clipboard integration via `vis-clipboard`
328 `ring-buffer.[ch]` | fixed size ring buffer used for the jump list
329 `sam.[ch]` | structural regular expression based command language
330 `text.[ch]` | low level text / marks / {un,re}do tree / piece table implementation
331 `text-motions.[ch]` | movement functions take a file position and return a new one
332 `text-objects.[ch]` | functions take a file position and return a file range
333 `text-regex.[ch]` | text search functionality, designated place for regex engine
334 `text-util.[ch]` | text related utility functions mostly dealing with file ranges
335 `ui-curses.[ch]` | a terminal / curses based user interface implementation
336 `ui.h` | abstract interface which has to be implemented by ui backends
337 `view.[ch]` | ui-independent viewport, shows part of a file, syntax highlighting, cursor placement, selection handling
338 `vis-cmds.c` | vi(m) `:`-command implementation
339 `vis-core.h` | internal header file, various structs for core editor primitives
340 `vis.c` | vi(m) specific editor frontend implementation
341 `vis.h` | vi(m) specific editor frontend library public API
342 `vis-lua.[ch]` | Lua bindings, exposing core vis APIs for in process extension
343 `vis-modes.c` | vi(m) mode switching, enter/leave event handling
344 `vis-motions.c` | vi(m) cursor motion implementations, uses `text-motions.h` internally
345 `vis-operators.c` | vi(m) operator implementation
346 `vis-prompt.c` | `:`, `/` and `?` prompt implemented as a regular file/window with custom key bindings
347 `vis-text-objects.c`| vi(m) text object implementations, uses `text-objects.h` internally
348 `vis.lua` | Lua library for vis, providing parts of the exposed API
349 `visrc.lua` | Lua startup and configuration script
351 Testing infrastructure for the [low level core data structures]
352 (https://github.com/martanne/vis-test/tree/master/core), [vim compatibility]
353 (https://github.com/martanne/vis-test/tree/master/vim), [sam compatibility]
354 (https://github.com/martanne/vis-test/tree/master/sam), [vis specific features]
355 (https://github.com/martanne/vis-test/tree/master/vis) and the [Lua API]
356 (https://github.com/martanne/vis-test/tree/master/lua) is in place, but
357 lacks proper test cases.