Delimited text converter (DTC)


Input format

The input must be a delimited text file. Fields are usually separated by tabs or spaces, but other delimiters can be specified in the form.

Setting the parameters

Specifying the delimiter
If the delimiter is not detected automatically, specify it by selecting the appropriate option. In addition to the supplied options, you may type a delimiter into the text box. The special keywords "tab", "space", and "spaces" can be used.
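
As a rough illustration of how the keywords could be interpreted, here is a minimal Python sketch. It is not the converter's own code: the function name split_line and the keyword table are hypothetical, and reading "spaces" as "a run of one or more spaces" is an assumption.

```python
import re

# Hypothetical keyword table; any other string is used literally.
KEYWORDS = {"tab": "\t", "space": " "}

def split_line(line, delimiter):
    """Split one line on a delimiter given literally or as a keyword."""
    if delimiter == "spaces":
        # Assumption: "spaces" means a run of one or more spaces.
        return re.split(r" +", line)
    return line.split(KEYWORDS.get(delimiter, delimiter))
```

Under these assumptions, split_line("a  b", "space") yields an empty field between the two spaces, while split_line("a  b", "spaces") does not.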

Treat all consecutive delimiters as one
In rare cases, specifying the delimiter is not enough, and multiple consecutive delimiters must be treated as a single delimiter. An example is a file in which any number of tabs separate the fields, and the fields themselves contain spaces. Here the gaps between adjacent tabs should not be treated as empty fields.
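
The difference can be seen in a small Python sketch. The function split_fields and its merge_consecutive parameter are hypothetical names chosen for illustration, not part of the converter.

```python
import re

def split_fields(line, delimiter="\t", merge_consecutive=False):
    """Split a line into fields, optionally collapsing delimiter runs."""
    if merge_consecutive:
        # One or more delimiters in a row count as a single separator.
        pattern = "(?:" + re.escape(delimiter) + ")+"
        return re.split(pattern, line)
    return line.split(delimiter)

line = "gene one\t\t\t4.2\tup"
print(split_fields(line))                          # ['gene one', '', '', '4.2', 'up']
print(split_fields(line, merge_consecutive=True))  # ['gene one', '4.2', 'up']
```

The space inside "gene one" is preserved in both cases because only tabs are treated as delimiters.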

Pad rows to a specific width
If this option is selected, fields are appended to each row so that every row has at least as many fields as the number you specify. Rows that are already longer are not changed.
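
A minimal Python sketch of this behaviour, assuming the rows have already been split into lists of fields. The name pad_rows and the empty-string filler are assumptions made for the example.

```python
def pad_rows(rows, width, filler=""):
    """Append filler fields so each row has at least `width` fields."""
    # Multiplying a list by a negative number gives an empty list,
    # so rows that are already long enough are left unchanged.
    return [row + [filler] * (width - len(row)) for row in rows]
```

For example, padding [["a"], ["a", "b", "c"]] to a width of 2 extends the first row to two fields and leaves the second row alone.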

Replace missing fields
If the file contains empty fields, it is often useful to replace them with a placeholder value. Iclust, for example, expects missing data to be represented by 'NaN'. This is the standard for this system.
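
The replacement step might look like the following Python sketch. The function name replace_missing is hypothetical, and treating whitespace-only fields as empty is an assumption.

```python
def replace_missing(rows, replacement="NaN"):
    """Replace empty fields with a placeholder value."""
    return [
        # Assumption: a field holding only whitespace counts as empty.
        [field if field.strip() else replacement for field in row]
        for row in rows
    ]
```

Applied to [["1.5", "", "2"]], this returns [["1.5", "NaN", "2"]].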

Strip empty leading columns
With this option selected, leading columns that are blank in every row are removed.

Strip empty trailing columns
With this option selected, trailing columns that are blank in every row are removed.
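
Both column-stripping options can be sketched with one Python function. The name strip_empty_columns and its parameters are hypothetical. The sketch assumes that a field missing from a short row counts as blank, and that whitespace-only fields are blank.

```python
def strip_empty_columns(rows, leading=True, trailing=True):
    """Drop leading and/or trailing columns that are blank in every row."""
    width = max((len(row) for row in rows), default=0)

    def column_is_blank(i):
        # Assumption: a field missing from a short row counts as blank.
        return all(i >= len(row) or not row[i].strip() for row in rows)

    start, end = 0, width
    if leading:
        while start < end and column_is_blank(start):
            start += 1
    if trailing:
        while end > start and column_is_blank(end - 1):
            end -= 1
    return [row[start:end] for row in rows]
```

Blank columns in the middle of the table are kept, since only the outer edges are scanned.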

Remove blank lines
If this option is selected, blank lines will be removed. A line counts as blank if it is completely empty or contains only delimiters, that is, if every one of its fields is empty.
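
The definition of a blank line can be expressed as a short Python sketch. The function name remove_blank_lines is hypothetical, and counting whitespace-only fields as empty is an assumption.

```python
def remove_blank_lines(lines, delimiter="\t"):
    """Keep only lines that have at least one non-empty field."""
    return [
        line
        for line in lines
        # A line is blank when every field is empty, which covers both
        # completely empty lines and lines made up only of delimiters.
        if any(field.strip() for field in line.split(delimiter))
    ]
```

A line such as "\t\t" splits into three empty fields and is therefore dropped, just like an empty line.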

Change the delimiter
These options convert every delimiter in the file to the one you specify. In addition to the supplied options, you may type a delimiter into the text box. The special keywords "tab", "space", and "spaces" can be used.
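
In its simplest form, the conversion amounts to splitting on the old delimiter and joining with the new one. The Python sketch below uses a hypothetical function name, change_delimiter, and assumes that fields are not quoted and never contain either delimiter.

```python
def change_delimiter(line, old, new):
    """Rewrite one line, replacing every `old` delimiter with `new`."""
    # Assumption: no quoting, and fields never contain `old` or `new`.
    return new.join(line.split(old))
```

Empty fields survive the conversion: change_delimiter("a,b,,c", ",", "\t") keeps the empty third field.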

Output

The output is a tab-delimited file containing the matrix.