03-10-2009
splitting tab-delimited file with awk
Hi all, I need help splitting a tab-delimited list into separate files by the filename field. The list is already sorted in ascending order by filename. An example list looks like this:
filename001 word1 word2
filename001 word3 word4
filename002 word1 word2
filename002 word3 word4
filename002 word5 word6
I have tried this using a slightly modified awk one-liner which I found here on unix.com:
awk -F'\t' '{ print >> ($1 ".tab") }' tabdelimitedfile.tab
This outputs the files filename001.tab and filename002.tab.
Unfortunately, this does not work on files larger than the example above when they contain too many different filenames. I get the error message "awk: *.tab makes too many open files" on exit.
I would be grateful if anyone could suggest an alternative way which outputs one file at a time every time a new filename is encountered.
Cheers
Per
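One possible approach, offered as a sketch rather than as an answer taken from the thread: awk keeps every output file open until close() is called or the program exits, which is what runs into the open-file limit. Because the list is already sorted by filename, only one output file needs to be open at a time, so the previous file can be closed whenever the first field changes. The variable names prev and out and the scratch directory are illustrative assumptions; the sample data is the example from the post.

```shell
# Work in a scratch directory so the demo does not touch existing files.
cd "$(mktemp -d)" || exit 1

# Recreate the sample list from the post (tab-delimited, sorted by field 1).
printf 'filename001\tword1\tword2\n'  > tabdelimitedfile.tab
printf 'filename001\tword3\tword4\n' >> tabdelimitedfile.tab
printf 'filename002\tword1\tword2\n' >> tabdelimitedfile.tab
printf 'filename002\tword3\tword4\n' >> tabdelimitedfile.tab
printf 'filename002\tword5\tword6\n' >> tabdelimitedfile.tab

awk -F'\t' '
# A new filename starts: close the previous output file before moving on,
# so that at most one output file is open at any time.
NR == 1 || $1 != prev {
    if (out != "") close(out)
    prev = $1
    out  = $1 ".tab"
}
# Append the unmodified input line to the current output file.
{ print >> out }
' tabdelimitedfile.tab
```

Since >> appends, any .tab output files left over from an earlier run should be removed before rerunning. If the input were not sorted, this should still produce the same files, because a file that is reopened is appended to, but it would close and reopen files far more often.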
LEARN ABOUT DEBIAN
vcf-isec
VCF-ISEC(1) User Commands VCF-ISEC(1)
NAME
vcf-isec - create intersections, unions, complements on bgzipped and tabix indexed VCF or tab-delimited files
SYNOPSIS
vcf-isec [OPTIONS] file1.vcf file2.vcf ...
DESCRIPTION
About: Create intersections, unions, complements on bgzipped and tabix indexed VCF or tab-delimited files.
Note that lines from all files can be intermixed together on the output, which can yield unexpected results.
OPTIONS
-C, --chromosomes <list|file>
Process the given chromosomes (comma-separated list or one chromosome per line in a file).
-c, --complement
Output positions present in the first file but missing from the other files.
-d, --debug
Debugging information
-f, --force
Continue even if the script complains about differing columns.
-o, --one-file-only
Print only entries from the left-most file. Without -o, all unique positions will be printed.
-n, --nfiles [+-=]<int>
Output positions present in this many (=), this many or more (+), or this many or fewer (-) files.
-p, --prefix <path>
If present, multiple files will be created with all possible isec combinations. (Suitable for Venn Diagram analysis.)
-t, --tab <chr:pos:file>
Tab-delimited file with indexes of chromosome and position columns. (1-based indexes)
-w, --win <int>
In repetitive sequences, the same indel can be called at different positions. Consider records this far apart as matching (be it a SNP or an indel).
-h, -?, --help
This help message.
EXAMPLES
bgzip file.vcf; tabix -p vcf file.vcf.gz
bgzip file.tab; tabix -s 1 -b 2 -e 2 file.tab.gz
vcf-isec 0.1.5 July 2011 VCF-ISEC(1)