eBay's TSV Utilities

Command line tools for large tabular data files.


Project maintained by eBay Hosted on GitHub Pages — Theme by mattgraham

Visit the Tools Reference main page
Visit the TSV Utilities main page

keep-header reference

Synopsis: keep-header [fileā€¦] -- program [args]

Execute a command against one or more files in a header-aware fashion. The first line of each file is assumed to be a header. The first header is output unchanged. Remaining lines are sent to the given command via standard input, excluding the header lines of subsequent files. Output from the command is appended to the initial header line. A double dash (--) delimits the command, similar to how the pipe operator (|) delimits commands.

The following commands sort files in the usual way, except for retaining a single header line:

$ keep-header file1.txt -- sort
$ keep-header file1.txt file2.txt -- sort -k1,1nr

Data can also be read from standard input. For example:

$ cat file1.txt | keep-header -- sort
$ keep-header file1.txt -- sort -r | keep-header -- grep red

The last example can be simplified using a shell command:

$ keep-header file1.txt -- /bin/sh -c '(sort -r | grep red)'

keep-header is especially useful for commands like sort and shuf that reorder input lines. It is also useful with filtering commands like grep, many awk uses, and even tail, where the header should be retained without filtering or evaluation.

keep-header works on any file where the first line is delimited by a newline character. This includes all TSV files and the majority of CSV files. It won't work on CSV files having embedded newlines in the header.

Options: