Post 302497304 by lv99 on Wednesday 16th of February 2011 11:07:22 PM
Old 02-17-2011
Help- counting delimiter in a huge file and split data into 2 files

I’m new to Linux scripting and am not sure how to filter bad records out of huge flat files (over 1.3 GB each). The delimiter is a semicolon “;”.

Here is a sample of 5 lines from the file:

Name1;phone1;address1;city1;state1;zipcode1
Name2;phone2;address2;city2;state2;zipcode2;comment
Name3;phone3;address3;city3;state3;zipcode3
Name4;phone4;address4;city4;state4;zipcode4
Name5;phone5;address5


I need a script that reads each line and counts the number of “;” characters on it:

If delimiter count = 5 Then
Write that line to goodfile1
Else
Write the bad line to rejectedfile1

The two output files should look like this:

goodfile1 has:

Name1;phone1;address1;city1;state1;zipcode1
Name3;phone3;address3;city3;state3;zipcode3
Name4;phone4;address4;city4;state4;zipcode4


rejectedfile1 has:

Name2;phone2;address2;city2;state2;zipcode2;comment
Name5;phone5;address5


Thanks
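
One way the rule above could be sketched is with awk, which reads the file one line at a time and so should cope with inputs of this size. This is a minimal sketch under assumptions, not a tested solution for the real data: input.txt is a hypothetical name for the flat file, and the sample is rebuilt here only so the commands can be tried safely. It relies on the fact that, with ";" as the field separator, a line containing exactly 5 delimiters has 6 fields.

```shell
#!/bin/sh
# input.txt is an assumed, hypothetical name for the flat file;
# goodfile1 and rejectedfile1 are the output names from the post.

# Rebuild the 5-line sample so the sketch can be tried before touching real data.
cat > input.txt <<'EOF'
Name1;phone1;address1;city1;state1;zipcode1
Name2;phone2;address2;city2;state2;zipcode2;comment
Name3;phone3;address3;city3;state3;zipcode3
Name4;phone4;address4;city4;state4;zipcode4
Name5;phone5;address5
EOF

# -F';' makes the semicolon the field separator, so NF (number of fields)
# equals the delimiter count plus one on any non-empty line.
awk -F';' '
    NF == 6 { print > "goodfile1"; next }   # exactly 5 semicolons: keep
            { print > "rejectedfile1" }     # anything else, empty lines included
' input.txt
```

Both output files are overwritten on each run, and an output file is only created if at least one line is routed to it. To process several input files, the same awk program could be wrapped in a loop with per-file output names.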
 
