- [kvpairs2td](#kvpairs2td)
- [td2kvpairs](#td2kvpairs)
- [td-add-headers](#td-add-headers)
- [td-alter](#td-alter)
- [td-collapse](#td-collapse)
- [td-disamb-headers](#td-disamb-headers)
- [td-expand](#td-expand)
- [td-filter](#td-filter)
- [td-gnuplot](#td-gnuplot)
- [td-keepheader](#td-keepheader)
- [td-lpstat](#td-lpstat)
- [td-pivot](#td-pivot)
- [td-rename](#td-rename)
- [td-select](#td-select)
- [td-trans](#td-trans)
- [td-trans-fixcol](#td-trans-fixcol)
- [td-trans-group](#td-trans-group)
- [td-trans-gshadow](#td-trans-gshadow)
- [td-trans-ls](#td-trans-ls)
- [td-trans-mount](#td-trans-mount)
- [td-trans-passwd](#td-trans-passwd)
- [td-trans-shadow](#td-trans-shadow)
csv2td - Transform CSV to tabular data format.
Read CSV data on STDIN.
Output tabular data to STDOUT.
Any option which Text::CSV(3pm) takes.
See `Text::CSV->known_attributes` for an extensive list.
csv2td --sep=';' --blank-is-undef=0 --binary
Text::CSV->new({sep=>";", blank_is_undef=>0, binary=>1})
Why is there no td2csv?
Why would you go back to ugly CSV when you have nice shiny Tabdata?
[csv2](#csv2)(1), [mrkv2td](#mrkv2td)(1)
kvpairs2td - Transform lines of key-value pairs to tabular data stream
- -i, --ignore-non-existing-columns
Do not fail when encountering a new field after the first record.
- -w, --warn-non-existing-columns
- -c, --column _COLUMN_
Indicate that there will be a column by the name _COLUMN_.
This is useful if the first record does not have _COLUMN_.
This option is repeatable.
- -r, --restcolumn _NAME_
Name of the column where the part of the input line which is not key-value pairs is put.
Default is **\_REST**.
- -u, --unknown-to-rest
Put unknown (non-existing) fields in the "rest" column.
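To illustrate the transformation itself (not the tool's actual implementation), here is a rough awk sketch that assumes space-separated `key=value` pairs and takes the column set from the first record:

```shell
printf 'name=alice age=30\nname=bob age=25\n' |
awk '
{
  split("", row)                         # reset the per-record values
  for (i = 1; i <= NF; i++) {
    split($i, kv, "=")
    row[kv[1]] = kv[2]
    if (NR == 1) cols[++ncols] = kv[1]   # the header comes from the first record
  }
  if (NR == 1)
    for (c = 1; c <= ncols; c++) printf "%s%s", cols[c], (c < ncols ? "\t" : "\n")
  for (c = 1; c <= ncols; c++) printf "%s%s", row[cols[c]], (c < ncols ? "\t" : "\n")
}'
```

This also shows why fields appearing only in later records are a problem: the header is already out by then, which is what the options above deal with.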
[td2mrkv](#td2mrkv)(1), [td2kvpairs](#td2kvpairs)(1)
mrkv2td - Transform multi-record key-value (MRKV) stream to tabular data format.
As the tabular data format presents field names at the start of the transmission,
[mrkv2td](#mrkv2td)(1) infers them only from the first record;
this way there is no need to buffer the whole dataset to find all fields,
and it is usual for all records to have all fields anyway.
- -s, --separator _REGEXP_
Regexp which separates the field name from the cell data in the MRKV stream.
Default is TAB (`\t`).
- -g, --multiline-glue _STRING_
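For illustration, a minimal awk sketch of the conversion — note the blank-line record separator used here is an assumption for the demo, not necessarily what mrkv2td expects:

```shell
printf 'name\talice\nage\t30\n\nname\tbob\nage\t25\n' |
awk -v RS= -F'\n' '
{
  # each blank-line-separated block is one record; each line is "key<TAB>value"
  for (i = 1; i <= NF; i++) {
    split($i, kv, "\t")
    keys[i] = kv[1]; vals[i] = kv[2]
  }
  if (NR == 1)   # field names are inferred from the first record only
    for (i = 1; i <= NF; i++) printf "%s%s", keys[i], (i < NF ? "\t" : "\n")
  for (i = 1; i <= NF; i++) printf "%s%s", vals[i], (i < NF ? "\t" : "\n")
}'
```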
[td2mrkv](#td2mrkv)(1)
td2html - Transform tabular data stream into an HTML table.
Takes a tabular data stream on STDIN and outputs an HTML table
enclosed in `<table>...</table>` tags.
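The essence of the conversion can be sketched in awk like this (a simplification, not the actual implementation; the header row becomes `<th>` cells, data rows `<td>` cells):

```shell
printf 'NAME\tUID\nroot\t0\n' |
awk -F'\t' '
NR == 1 { print "<table>"; tag = "th" }   # open the table; header cells use <th>
{
  printf "<tr>"
  for (i = 1; i <= NF; i++) printf "<%s>%s</%s>", tag, $i, tag
  print "</tr>"
  tag = "td"                               # all further rows use <td>
}
END { print "</table>" }'
```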
td2kvpairs - Transform tabular data into key-value pairs
- -r, --prefix-field _NAME_
Put this field's content before the list of key-value pairs.
Default is **\_REST**.
The prefix and the key-value pairs are separated by a space character,
if there is any prefix.
[td2mrkv](#td2mrkv)(1), [kvpairs2td](#kvpairs2td)(1)
td2mrkv - Transform tabular data into multi-record key-value (MRKV) format.
- -s, --separator _STR_
String which separates the field name from the content.
Default is TAB (`\t`).
getent passwd | tr : "\\t" | td-add-headers USER PW UID GID GECOS HOME SHELL | td-select +ALL -PW | td2mrkv
[mrkv2td](#mrkv2td)(1), [td2html](#td2html)(1)
td-add-headers - Add headers to the tabular data stream and pass through the rows.
td-add-headers _COLNAME\_1_ _COLNAME\_2_ ...
Add a header row to the tabular data stream. Header names will be the
ones specified in the command line arguments, assigned left to right.
If there are more fields in the first data row, then additional columns
will be added with names like "COL4", "COL5", etc., derived from the index number
of the column counting from 1.
This may be prevented by the --no-extra-columns option.
- -x, --extra-columns
Also give a name to those columns which are not given a name in the command parameters.
- -X, --no-extra-columns
Do not add more columns than specified in the command parameters.
who | td-trans | td-add-headers USER TTY DATE TIME COMMENT
td-alter - Add new columns and fields to a tabular data stream, and modify values of existing fields.
td-alter _COLUMN_=_EXPR_ \[_COLUMN_=_EXPR_ \[_COLUMN_=_EXPR_ \[...\]\]\]
On each data row, sets the field in _COLUMN_ to the value resulting from _EXPR_.
In _EXPR_, you may refer to other fields by `$F{NAME}` where _NAME_ is the column name,
or by `$F[INDEX]` where _INDEX_ is the 0-based column index number.
Furthermore you may refer to uppercase alphanumeric field names simply by the bareword `COLUMN`,
or enclosed in parentheses like `(COLUMN)` to avoid parsing ambiguity in Perl.
This is possible because these column names are set up as subroutines internally.
The topic variable (`$_`) is initially set to the current value of _COLUMN_ in _EXPR_.
So for example `N='-$_'` makes the field N the negative of itself.
You can create new columns simply by referring to a _COLUMN_ name that does not exist yet.
You can refer to an earlier defined _COLUMN_ in subsequent _EXPR_ expressions.
Add new columns: TYPE and IS\_BIGFILE.
IS\_BIGFILE depends on the previously defined TYPE field.
ls -l | td-trans-ls | td-alter TYPE='substr MODE,0,1' IS_BIGFILE='SIZE>10000000 && TYPE ne "d" ? "yes" : "no"'
Strip sub-seconds and timezone from the DATETIME field:
TIME_STYLE=full-iso ls -l | td-trans-ls | td-alter DATETIME='s/\..*//; $_'
show headers (default)
"Alter" in td-alter comes from SQL.
[td-alter](#td-alter)(1) can change the "table" column layout.
But contrary to SQL's ALTER TABLE, [td-alter](#td-alter)(1) can modify the records too, so it is akin to SQL UPDATE as well.
td-collapse - Collapse multiple tabular data records with equivalent keys into one.
td-collapse \[_OPTIONS_\]
It goes row-by-row on a sorted tabular data stream,
and if the first (key) cell of 2 or more subsequent rows is
the same, collapses them into one row.
This is done by joining the corresponding cells' data from each row into one
cell, effectively keeping every column's data in the same column.
If you want to group by another column, not the first one, first
reorder the columns with [td-select](#td-select)(1), e.g. `td-select KEYCOLUMN +REST`.
Delimiter character or string between joined cell data.
- -u, --distribute-unique-field _FIELD_
Take the _FIELD_ column's cells from the first collapsed group,
and multiply all other columns into as many new columns as there are rows in this group,
in a way that each cell goes under a new column corresponding to that cell's original row.
_FIELD_'s cells need to be unique within each group.
If an unexpected value is found while processing the 2nd row group and onwards,
i.e. a value which was not there in the first group,
it is not distributed into a new column, since the header has already been sent,
but is left in the original column, just as if the **-u** option were not in effect.
See "pause" and "resume" in the example below.
ID | EVENT | TIME | STATUS
15 | end | 10:05 | ok
16 | end | 11:06 | err
16 | resume | 11:05 |
COUNT | ID | EVENT | TIME | TIME_start | TIME_end | STATUS | STATUS_start | STATUS_end
2 | 15 | | | 10:00 | 10:05 | | | ok
4 | 16 | pause resume | 11:04 11:05 | 11:00 | 11:06 | | | err
- -s, --distributed-column-name-separator _STR_
When generating new columns as described at the **-u** option,
join the original column name with each of the unique field's values.
See the example at the **-u** option description.
Default is underscore (`_`).
This pipeline shows which users are using each of the configured default
shells, grouped by shell path.
# get the list of users
# transform into tabular data stream
td-add-headers USER X UID GID GECOS HOME SHELL |\
# put the shell in the first column, and sort, then collapse
td-select SHELL USER | td-keepheader sort | td-collapse -g ' ' |\
# change header name "USER" to "USERS"
td-alter USERS=USER | td-select +ALL -USER
| COUNT | SHELL | USERS |
| 4 | /bin/bash | user1 user2 nova root |
| 5 | /bin/false | fetchmail hplip sddm speech-dispatcher sstpc |
| 1 | /bin/sync | sync |
| 1 | /sbin/rebootlogon | reboot |
| 6 | /usr/sbin/nologin | _apt avahi avahi-autoipd backup bin daemon |
You have to sort the input data first.
The group key is always the first input column.
If a row in the input data has more cells than the number of columns, those are ignored.
[td-expand](#td-expand)(1) is a kind of inverse to [td-collapse](#td-collapse)(1).
[td-collapse](#td-collapse)(1) roughly translates to SELECT COUNT(\*) + GROUP\_CONCAT() + GROUP BY in SQL.
td-disamb-headers - Disambiguate headers in tabular data
Change column names in the input tabular data stream by appending a sequential number
to the duplicated column names.
The first occurrence is kept as-is.
If a particular column name already ends with an integer, it gets incremented.
echo "PID PID PID2 PID2 USER CMD" | td-disamb-headers
PID PID3 PID2 PID4 USER CMD
td-expand - Generate multiple rows from each row in a tabular data stream.
td-expand \[-f _FIELD_\] \[-s _SEPARATOR_\]
It goes row-by-row, splits the given _FIELD_ at _SEPARATOR_ chars,
creates as many rows on the output as the number of parts _FIELD_ is split into,
fills the _FIELD_ column in each row with one of the parts,
and fills all other columns in all resulting rows with the corresponding column's data from the input.
| /bin/bash | user1 user2 |
| /bin/dash | user3 user4 |
td-expand -f USERS -s ' ' | td-alter USER=USERS | td-select +ALL -USERS
| /bin/bash | user1 |
| /bin/bash | user2 |
| /bin/dash | user3 |
| /bin/dash | user4 |
- -f, --field _FIELD_
Which field to break up.
Default is the first one.
- -s, --separator _PATTERN_
Regexp pattern to split _FIELD_ at.
[td-collapse](#td-collapse)(1) is a kind of inverse to [td-expand](#td-expand)(1).
td-filter - Show only those records from the input tabular data stream which match the conditions.
td-filter \[_OPTIONS_\] \[--\] _COLUMN_ _OPERATOR_ _R-VALUE_ \[\[**or**\] _COLUMN_ _OPERATOR_ _R-VALUE_ \[\[**or**\] ...\]\]
td-filter \[_OPTIONS_\] --perl _EXPR_
Pass through those records which match at least one of the conditions (inclusive OR).
A condition consists of a triplet of _COLUMN_, _OPERATOR_, and _R-VALUE_.
You may combine conditions conjunctively (AND) by chaining multiple [td-filter](#td-filter)(1) commands with shell pipes.
td-filter NAME eq john NAME eq jacob | td-filter AGE -gt 18
This gives the records with either john or jacob, and all of them will be above 18.
The optional word "**or**" between triplets makes your code more explicit.
In the second form, [td-filter](#td-filter)(1) evaluates the Perl expression and passes through records
only if the result is true-ish in Perl (non-zero, non-empty string, etc.).
Each field's value is in `@F` by index, and in `%F` by column name.
You can implement more complex conditions this way.
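The `--perl` form can be pictured with this awk analogue (illustration only — the real tool evaluates a Perl expression with `@F` and `%F` populated):

```shell
printf 'NAME\tAGE\njohn\t20\njacob\t15\n' |
awk -F'\t' '
NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; print; next }  # map names to indices
$col["AGE"] > 18                                                # pass matching rows through
'
```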
show headers (default)
- -i, --ignore-non-existing-columns
do not treat non-existing (missing or typo) column names as failure
- -w, --warn-non-existing-columns
only show a warning on non-existing (missing or typo) column names, but don't fail
- -N, --no-fail-non-numeric
do not fail when a non-numeric r-value is given to a numeric operator
- -W, --no-warn-non-numeric
do not show a warning when a non-numeric r-value is given to a numeric operator
These operators are supported; semantics are the same as in Perl, see [perlop](#perlop)(1).
== != <= >= < > =~ !~ eq ne gt lt
For your convenience, so you need not bother with escaping, you may also use these operators as alternatives to the canonical ones above:
- = _(single equal sign)_
string equality (**eq**)
string inequality (**ne**)
numeric equality (**==**)
numeric inequality (**!=**)
numeric inequality (**!=**)
numeric greater than (**>**)
numeric greater or equal (**>=**)
numeric less than (**<**)
numeric less or equal (**<=**)
regexp match (**=~**)
negated regexp match (**!~**)
_R-VALUE_ is split into pieces at commas (`,`) and
equality to at least one of them is required.
If the operator is negated, equality to none of them is required.
- contains \[whole word\]
The plural form "contain" is also accepted.
The optional _whole word_ is a literal part of the operator.
- contains \[one | any\] \[whole word\] of
Similar to **is one of**, but a substring match is checked
instead of full string equality.
The plural form "contain" is also accepted.
The optional _whole word_ is a literal part of the operator.
Plural forms are also accepted.
Operators may be preceded by _not_, _does not_, _do not_ to negate their effect.
If there is no _COLUMN_ column in the input data, it is silently considered empty.
[td-filter](#td-filter)(1) does not need _R-VALUE_ to be quoted or escaped, however your shell may require it.
[td-filter](#td-filter)(1) is analogous to SQL WHERE.
td-gnuplot - Graph tabular data using [gnuplot](#gnuplot)(1)
td-gnuplot \[_OPTIONS_\]
Invoke [gnuplot](#gnuplot)(1) to graph the data represented in tabular data format on STDIN.
The first column is the X axis, the rest of the columns are data lines.
The default is to output an ascii-art chart to the terminal ("dumb" output in gnuplot).
td-gnuplot guesses the data format from the column names.
If the 0th column matches "date" or "time" (case-insensitively), then the X axis will be a time axis.
If the 0th column matches "time", then a unix epoch timestamp is assumed.
Otherwise specify which date/time format is used, e.g. by the **--timefmt=%Y-%m-%d** option.
Plot data read from STDIN is buffered in a temp file
(provided by `File::Temp->new(TMPDIR=>1)` and immediately unlinked, so no waste product is left around),
because [gnuplot](#gnuplot)(1) needs to seek in it when plotting more than 1 data series.
Output an image (PNG) to STDOUT,
instead of drawing to the terminal.
Let [gnuplot](#gnuplot)(1) decide the output medium,
instead of drawing to the terminal.
- --_SETTING_=_VALUE_
Set any gnuplot setting, optionally setting its value to _VALUE_.
_SETTING_ is a setting name used in `set ...` gnuplot commands, except spaces are replaced with dashes.
_VALUE_ is always passed to gnuplot enclosed in double quotes.
--xtics-rotate-by=-90
The equivalent gnuplot command:
set xtics rotate by "-90"
Pass arbitrary gnuplot commands to gnuplot.
This option may be repeated.
This is passed to [gnuplot](#gnuplot)(1) on the command line (**-e** option)
after [td-gnuplot](#td-gnuplot)(1)'s own sequence of gnuplot setup commands
and after the **--_SETTING_** settings are applied,
so you can override them.
td-keepheader - Plug a non header-aware program into the tabular-data processing pipeline
td-keepheader \[--\] <COMMAND> \[<ARGS>\]
ls -l | td-trans-ls | td-select NAME +REST | td-keepheader sort | tabularize
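Conceptually, td-keepheader holds back the header line, runs the wrapped command on the data rows only, and re-emits the header on top — roughly this shell idiom:

```shell
printf 'SHELL\tUSER\n/bin/sh\tzed\n/bin/bash\talice\n' |
{ IFS= read -r header; printf '%s\n' "$header"; sort; }
# the SHELL/USER header stays first; only the data rows are sorted
```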
td-lpstat - [lpstat](#lpstat)(1) wrapper to output printers' status in tabular data format
td-ls - [ls](#ls)(1)-like file list, but more machine-parseable
td-ls \[_OPTIONS_\] \[_PATHS_\] \[-- _FIND-OPTIONS_\]
## OPTIONS, [ls](#ls)(1)-compatible
- -n, --numeric-uid-gid
- --time=\[atime, access, use, ctime, status, birth, creation, mtime, modification\]
- -U (implied, pipe to [sort](#sort)(1) if you want)
## OPTIONS, not [ls](#ls)(1)-compatible
- --no-symlink-target
- --add-field _FIELD-NAME_
Add extra fields by name.
See field names with the **--help-field-names** option.
May be given multiple times.
- --add-field-macro _FORMAT_
Add extra fields by a [find](#find)(1)-style format specification.
For valid _FORMAT_s, see the **-printf** section in [find](#find)(1).
May be given multiple times.
Putting `\\0` (backslash-zero) in _FORMAT_ screws up the output; don't do that.
Show valid field names to be used with the **--add-field** option.
Columns are similar to good old [ls](#ls)(1):
PERMS (symbolic representation),
USERNAME (USERID if the **-n** option is given),
GROUPNAME (GROUPID if the **-n** option is given),
the time field is either ATIME, CTIME, or the default MTIME (in full-iso format),
BASENAME (or RELPATH in **--recursive** mode),
and SYMLINKTARGET (unless the **--no-symlink-target** option is given).
Column names are a bit different from what [td-trans-ls](#td-trans-ls)(1) produces, but this is intentional,
because the fields of these 2 tools have slightly different meanings.
[td-trans-ls](#td-trans-ls)(1) is less smart because it just transforms [ls](#ls)(1)'s output and
does not always know exactly what is in the input; while [td-ls](#td-ls)(1) itself controls
what data goes to the output.
The output format is tabular data: a table in which fields are delimited by TAB
and records by newline (LF).
Meta chars may occur in some fields (path, filename, symlink target, etc.);
these are escaped in this (perl-compatible) way:
| Raw char | Substituted to |
|-----------|----------------|
Other control chars (charcode below 32 in ASCII),
including NUL, vertical-tab, and form-feed, are left as-is.
**TIME\_STYLE** is ignored, as is the _--time-style_ option.
The date-time is always shown in `%F %T %z` [strftime](#strftime)(3) format!
It's simply the most superior.
Equivalent to **TIME\_STYLE=full-iso**.
[td-select](#td-select)(1), [td-filter](#td-filter)(1), [td-trans-ls](#td-trans-ls)(1)
td-pivot - Switch columns for rows in tabular data
It must read and buffer the whole STDIN before outputting any data,
so it is impractical on large data.
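The transposition can be sketched in awk, which also shows why full buffering is unavoidable: every input row contributes one cell to every output row. This is an illustration, not the tool's actual implementation:

```shell
printf 'NAME\tUID\nroot\t0\nbin\t1\n' |
awk -F'\t' '
{
  for (i = 1; i <= NF; i++) cell[i, NR] = $i   # buffer the whole table
  cols = NF; rows = NR
}
END {
  for (i = 1; i <= cols; i++)
    for (j = 1; j <= rows; j++)
      printf "%s%s", cell[i, j], (j < rows ? "\t" : "\n")
}'
```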
td-rename - Rename tabular data columns
td-rename _OLDNAME_ _NEWNAME_ \[_OLDNAME_ _NEWNAME_ \[_OLDNAME_ _NEWNAME_ \[...\]\]\]
conntrack -L | sd '^(\S+)\s+(\S+)\s+(\S+)' 'protoname=$1 protonum=$2 timeout=$3' | kvpairs2td | td-rename _REST FLAGS
Not to be confused with [rename.td](#rename.td)(1), which renames files, not columns.
td-select - Show only the specified columns from the input tabular data stream.
td-select \[_OPTIONS_\] \[--\] \[-\]_COLUMN_ \[\[-\]_COLUMN_ \[...\]\]
show headers (default)
- -i, --ignore-non-existing-columns
do not treat non-existing (missing or typo) column names as failure
- -w, --warn-non-existing-columns
only show a warning on non-existing (missing or typo) column names, but don't fail
warn and fail on non-existing (missing or typo) column names given in the
parameters, even if prefixed with a hyphen, i.e. when the user wants to
remove the named column from the output.
_COLUMN_ is either a column name,
or one of these special keywords:
the rest of the columns not yet given in the parameter list
_COLUMN_ may be prefixed with a minus (`-`),
in which case the given column will not be shown,
i.e. it is removed from the shown columns.
So if you want to show all columns except one or two:
td-select +ALL -PASSWD
If you want to put a given column (say "KEY") in the first place and leave the others intact:
ls -l | td-trans-ls | td-select -- NAME +REST -INODE -LINKS -MAJOR -MINOR
"Select" in td-select comes from SQL.
Similarly to SQL, [td-select](#td-select)(1) chooses some of the columns and returns them in the given order.
td-sort - Sort tabular data by the columns given by name
All those which are accepted by [sort](#sort)(1),
except you don't need to refer to columns by ordinal number.
[sort](#sort)(1) defines _KEYDEF_ as `F[.C][OPTS][,F[.C][OPTS]]`,
where **F** is the (1-based) field number.
However, with [td-sort](#td-sort)(1) you may refer to fields by name.
But since **F** then no longer consists only of digits,
but is an arbitrary string,
it may be ambiguous where the name ends.
So you may enclose names in round/square/curly/angle brackets.
Choose the kind which does not occur in the column name.
You don't even need to type **-k**, because a lone _COLUMN-NAME_
is interpreted as "**-k** _F_" where _F_ is the corresponding field number.
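The name-to-field-number translation can be approximated in plain shell; this sketch assumes TAB-delimited input with a header row and is only a demonstration of the idea (td-sort's real option handling is richer):

```shell
input=$(printf 'NAME\tUID\nroot\t0\nalice\t1000\nbin\t2\n')
col=UID
# resolve the column name to its 1-based field number from the header
idx=$(printf '%s\n' "$input" | head -n 1 | tr '\t' '\n' | grep -n -x "$col" | cut -d : -f 1)
# keep the header on top and sort the data rows numerically on that field
printf '%s\n' "$input" |
{ IFS= read -r h; printf '%s\n' "$h"; sort -t "$(printf '\t')" -k "$idx,$idx" -n; }
```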
[td-sort](#td-sort)(1) is analogous to SQL ORDER BY.
td-trans - Transform whitespace-delimited lines into TAB-delimited ones, ignoring surrounding whitespace.
- -m, --max-columns _NUM_
Maximum number of columns.
The _NUM_th column may contain any whitespace.
By default it's the number of fields in the header (first line).
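For the common case the squeezing amounts to this classic awk idiom (a simplification that ignores the **-m** behaviour): reassigning `$1` forces awk to rebuild the line with OFS, here a TAB:

```shell
printf '  alice   tty1   2024-01-01  \n' |
awk -v OFS='\t' '{ $1 = $1; print }'   # surrounding whitespace dropped, runs squeezed to one TAB
```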
td-trans-fixcol - Transform a table-looking text, aligned to fixed columns by spaces, into tabular data.
The first line is the header, consisting of the column names.
Each field's text must start in the same terminal column as the column name.
- -m, --min-column-spacing _NUM_
Minimum spacing between columns.
This allows the input data to have column names with single spaces.
arp -n | td-trans-fixcol
td-trans-ls - Transform [ls](#ls)(1) output into a fixed number of TAB-delimited columns.
Supported [ls](#ls)(1) options which affect its output format:
- --time-style={iso,long-iso,full-iso}
- --time-style=locale
td-trans-mount - Transform [mount](#mount)(1) output to tabular data stream.
Supported [mount](#mount)(1) options which affect output format:
mount | td-trans-mount
mount -l | td-trans-mount
vcf2td - Transform VCF to tabular data format.
- -c, --column _COLUMN_
Indicate that there will be a column by the name _COLUMN_.
Useful if the first record does not contain all fields
which otherwise occur in the whole data stream.
By default, [vcf2td](#vcf2td)(1) recognizes the fields which are in the first record of the VCF input,
and does not read ahead more records before sending the header.
This option is repeatable.
- -i, --ignore-non-existing-columns
Don't fail and don't warn when encountering new field names.
The tabular data format declares all of the field names in the column headers,
so it cannot introduce new columns later on in the data stream
(unless some records were buffered, which currently they are not).
However in VCF, each record may have fields different from the first record.
That's why [vcf2td](#vcf2td)(1) fails by default
if it encounters a field it cannot convert to tabular.
- -w, --warn-non-existing-columns
Only warn on new fields, but don't fail.
- -g, --multivalue-glue _STR_
A string to glue repeated fields' values together
when the repeated fields are handled by uniting their content into one tabdata column.
Note, even though newline is the default glue,
if you want to be explicit about it (or want to set another glue _STR_ often expressed by some backslash sequence),
`vcf2td -g "\n" ...` probably won't quite work as one may expect (depending on one's shell),
because the shell passes the two-character string "backslash" + "n",
instead of a string consisting of just one "newline" char.
So, in bash, write it as `vcf2td -g $'\n' ...`.
## COMMON vCard FIELDS
**N** is for a contact's name, with its parts separated by `;` semicolons.
[vcf2td](#vcf2td)(1) simplifies the **N** field by removing excess semicolons.
If you need one or more name parts precisely,
request the **N.family**, **N.given**, **N.middle**, **N.prefixes** fields
by the **-c** option if you want,
but this name partitioning method is not quite internationally useful;
use the **FN** (full name) field for persons' names as much as you can.