eBay's TSV Utilities

Command line tools for large tabular data files.


Project maintained by eBay Hosted on GitHub Pages — Theme by mattgraham

Visit the Tools Reference main page
Visit the TSV Utilities main page

tsv-append reference

Synopsis: tsv-append [options] [fileā€¦]

tsv-append concatenates multiple TSV files, similar to the Unix cat utility. Unlike cat, it is header-aware (--H|header), writing the header from only the first file. It also supports source tracking, adding a column indicating the original file to each row. Results are written to standard output.

Concatenation with header support is useful when preparing data for traditional Unix utilities like sort and sed or applications that read a single file.

Source tracking is useful when creating long/narrow form tabular data, a format used by many statistics and data mining packages. In this scenario, files have been used to capture related data sets, the difference between data sets being a condition represented by the file. For example, results from different variants of an experiment might each be recorded in their own files. Retaining the source file as an output column preserves the condition represented by the file.

The file-name (without extension) is used as the source value. This can customized using the --f|file option.

Example: Header processing:

$ tsv-append -H file1.tsv file2.tsv file3.tsv

Example: Header processing and source tracking:

$ tsv-append -H -t file1.tsv file2.tsv file3.tsv

Example: Source tracking with custom source values:

$ tsv-append -H -s test_id -f test1=file1.tsv -f test2=file2.tsv

Options: