02-17-2011
I tried this command below since the record is 4000 byte and has 290 ; as delimiters. i need to filter out bad records where the delimiter counts do not match.
awk -F \; '{print>(NF==291?"goodfile1":"rejectedfile1")}' infile
and got this error
awk: syntax error near line 1
awk: bailing out near line 1
I also tried this
awk -F";" 'NF==291{print >"goodfile" ;next}{print >"rejected"}' infile
and get different error
awk: record `00000036200800;20080...' too long
it seems like awk has limitation on the record length :-( google directed me to simple change to the command. and nawk worked well. Thank you everyone!
nawk -F";" 'NF==291{print >"goodfile" ;next}{print >"rejected"}' infile
Last edited by lv99; 02-17-2011 at 01:26 PM..
9 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
Input file data contents:
>seq_1
MSNQSPPQSQRPGHSHSHSHSHAGLASSTSSHSNPSANASYNLNGPRTGGDQRYRASVDA
>seq_2
AGAAGRGWGRDVTAAASPNPRNGGGRPASDLLSVGNAGGQASFASPETIDRWFEDLQHYE
>seq_3
ATLEEMAAASLDANFKEELSAIEQWFRVLSEAERTAALYSLLQSSTQVQMRFFVTVLQQM
ARADPITALLSPANPGQASMEAQMDAKLAAMGLKSPASPAVRQYARQSLSGDTYLSPHSA... (7 Replies)
Discussion started by: patrick87
7 Replies
2. Shell Programming and Scripting
Below is my perl script:
#!/usr/bin/perl
open(FILE,"$ARGV") or die "$!";
@DATA = <FILE>;
close FILE;
$join = join("",@DATA);
@array = split( ">",$join);
for($i=0;$i<=scalar(@array);$i++){
system ("/home/bin/./program_name_count_length MULTI_sequence_DATA_FILE -d... (5 Replies)
Discussion started by: patrick87
5 Replies
3. Shell Programming and Scripting
I have a directory of files that I need to rename by splitting the first and second halves of the filenames using the delimiter "-O" and then renaming with the second half first, followed by two underscores and then the first half. For example, natfinal1995annvol1_14.pdf -O filenum-20639 will be... (2 Replies)
Discussion started by: swimulator
2 Replies
4. Shell Programming and Scripting
I have file which contains around 5000 lines.
The lines are fixed legth but having no delimiter.Each line line contains nearly 3000 characters.
I want to delete the lines
a> if it starts with 1 and if 576th postion is a digit i,e 0-9
or
b> if it starts with 0 or 9(i,e header and footer)
... (4 Replies)
Discussion started by: millan
4 Replies
5. Shell Programming and Scripting
Hi,
I have a file which has many URLs delimited by space. Now i want them to move to separate files each one holding 10 URLs per file.
http://3276.e-printphoto.co.uk/guardian http://abdera.apache.org/ http://abdera.apache.org/docs/api/index.html
I have used the below code to arrange... (6 Replies)
Discussion started by: vel4ever
6 Replies
6. UNIX for Dummies Questions & Answers
Hi,
I have a Huge 7 GB file which has around 1 million records, i want to split this file into 4 files to contain around 250k messages each.
Please help me as Split command cannot work here as it might miss tags..
Format of the file is as below
<!--###### ###### START-->... (6 Replies)
Discussion started by: KishM
6 Replies
7. Shell Programming and Scripting
We have a folder XYZ with large number of files (>350,000). how can i split the folder and create say 10 of them XYZ1 to XYZ10 with 35,000 files each. (doesnt matter which files go where). (12 Replies)
Discussion started by: AlokKumbhare
12 Replies
8. UNIX for Advanced & Expert Users
I have 2 large file (.dat) around 70 g, 12 columns but the data not sorted in both the files.. need your inputs in giving the best optimized method/command to achieve this and redirect the not macthing lines to the thrid file ( diff.dat)
File 1 - 15 columns
File 2 - 15 columns
Data is... (9 Replies)
Discussion started by: kartikirans
9 Replies
9. UNIX for Beginners Questions & Answers
I have a large semicolon delimited file with thousands of columns and many thousands of line. It looks like:
ID1;ID2;ID3;ID4;A_1;B_1;C_1;A_2;B_2;C_2;A_3;B_3;C_3
AA;ax;ay;az;01;02;03;04;05;06;07;08;09
BB;bx;by;bz;03;05;33;44;15;26;27;08;09
I want to split this table in to multiple files:
... (1 Reply)
Discussion started by: trymega
1 Replies