Post 302497304 by lv99 on Wednesday 16th of February 2011 11:07:22 PM
Help- counting delimiter in a huge file and split data into 2 files

I'm new to Linux scripting and not sure how to filter bad records out of huge flat files (over 1.3 GB each). The delimiter is a semicolon (";").

Here is a sample of 5 lines from the file:

Name1;phone1;address1;city1;state1;zipcode1
Name2;phone2;address2;city2;state2;zipcode2;comment
Name3;phone3;address3;city3;state3;zipcode3
Name4;phone4;address4;city4;state4;zipcode4
Name5;phone5;address5


I need a script that reads each line and counts the number of ; on it:

If delimiter count = 5 Then
    Write that line to goodfile1
Else
    Write the bad line to rejectedfile1

The two output files should then look like this:

goodfile1 has:

Name1;phone1;address1;city1;state1;zipcode1
Name3;phone3;address3;city3;state3;zipcode3
Name4;phone4;address4;city4;state4;zipcode4


rejectedfile1 has:

Name2;phone2;address2;city2;state2;zipcode2;comment
Name5;phone5;address5


Thanks
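
One possible approach to the filter described above, offered as a sketch rather than a definitive answer: with ";" as awk's field separator, a line holding exactly 5 delimiters has 6 fields, so testing NF == 6 is equivalent to counting semicolons. awk reads the input one line at a time, so the 1.3 GB size should not be a memory problem. The function name split_by_delims and the input file name are hypothetical; goodfile1 and rejectedfile1 come from the post.

```shell
#!/bin/sh
# Sketch only: split_by_delims is a hypothetical helper name.
# Usage: split_by_delims INPUT GOODFILE REJECTEDFILE
split_by_delims() {
    # Start with empty output files so both exist even if one gets no lines.
    : > "$2"
    : > "$3"
    # 5 semicolons on a line means 6 fields when the separator is ";".
    awk -F';' -v good="$2" -v bad="$3" '
        NF == 6 { print > good; next }   # exactly 5 delimiters: keep
        { print > bad }                  # anything else: reject
    ' "$1"
}

# Run directly only when three arguments are supplied.
if [ "$#" -eq 3 ]; then
    split_by_delims "$@"
fi
```

Saved as, say, split.sh (an assumed name), it could be run as `sh split.sh input.txt goodfile1 rejectedfile1`. Note that empty lines have zero fields and therefore land in the rejected file.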
 
