Need to Extract Data From 94000 records


 
# 1  
Old 04-13-2009

I have an input file that does not have a delimiter.
All I need to do is identify each matching line, extract the data from it, then repeat for the next line, making sure that a value already extracted is not extracted again.


Input file
------------
abcd 12345 egfhijk ip 192.168.0.1 CNN.com
abcd 12345 egfhijk ip 192.168.0.1 hotmail.com
abcd 12345 egfhijk STARTTLS=server ip 192.168.0.1 hotmail.com
asjdh khasdhsdf kshdfjksdhfk STARTTLS=server ip 192.168.2.43 abc.yahoo.com
sdfhh jkadfhh jasdfhjkasdhf STARTTLS=server ip 62.43.65.45 abc.cool.com
abcd 12345 egfhijk STARTTLS=server ip 192.168.0.1 hotmail.com
asjdh khasdhsdf kshdfjksdhfk STARTTLS=server ip 192.168.2.43 abc.yahoo.com
sdfhh jkadfhh jasdfhjkasdhf STARTTLS=server ip 62.43.65.45 abc.cool.com

---------

Output File
-----------
hotmail.com
yahoo.com
cool.com


As you can see, data is repeated in the input file, but I need to ensure that it is not repeated in the output file, and that only lines containing STARTTLS=server are checked.

Note: the input file is approx. 35 MB.
# 2  
Old 04-13-2009
What have you tried?
# 3  
Old 04-13-2009
Hope it's not homework :)
Code:
awk '/STARTTLS/{print $NF}' filename|sort -u
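
Note that the one-liner above prints the whole last field (for example abc.yahoo.com), while the requested output lists only yahoo.com. A minimal sketch of a variant that keeps the last two dot-separated labels and de-duplicates inside awk follows. The function name extract_domains and the array names label and seen are made up for illustration, and the two-label rule is an assumption that will not hold for names such as example.co.uk.

```shell
# Hypothetical helper: reads the log from the named files, or from stdin
# when no file is given.
extract_domains() {
    awk '/STARTTLS=server/ {
        n = split($NF, label, ".")        # break the last field into labels
        if (n < 2) next                   # skip fields that are not dotted names
        domain = label[n-1] "." label[n]  # assumption: last two labels form the domain
        if (!seen[domain]++) print domain # print each domain only the first time
    }' "$@"
}

# Sample lines taken from the question.
extract_domains <<'EOF'
abcd 12345 egfhijk ip 192.168.0.1 CNN.com
abcd 12345 egfhijk STARTTLS=server ip 192.168.0.1 hotmail.com
asjdh khasdhsdf kshdfjksdhfk STARTTLS=server ip 192.168.2.43 abc.yahoo.com
sdfhh jkadfhh jasdfhjkasdhf STARTTLS=server ip 62.43.65.45 abc.cool.com
abcd 12345 egfhijk STARTTLS=server ip 192.168.0.1 hotmail.com
EOF
```

Because the seen array filters duplicates as lines are read, no sort is needed and the domains come out in first-seen order. Pipe the result through sort if sorted output is preferred.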

# 4  
Old 04-14-2009
perl:

Code:
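use strict;
use warnings;

my %hash;
open my $fh, "<", "a.txt" or die "Cannot open a.txt: $!";
while (<$fh>) {
	if (/STARTTLS=server/) {
		my @arr = split(" ", $_);           # whitespace-separated fields
		my @brr = split(/[.]/, $arr[-1]);   # labels of the last field (the host name)
		next if @brr < 2;                   # skip anything that is not a dotted name
		my $temp = sprintf("%s.%s", $brr[-2], $brr[-1]);   # last two labels, e.g. yahoo.com
		$hash{$temp} = 1;                   # hash keys remove duplicates
	}
}
close $fh;
print join("\n", sort keys %hash), "\n";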

# 5  
Old 04-14-2009
Quote:
Originally Posted by summer_cherry
perl:

You're a jerk.
# 6  
Old 04-14-2009
Quote:
Originally Posted by KevinADC
You're a jerk.
relax.

Code:
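my %hash;
while (<>) {                       # read the files named on the command line, or stdin
    next unless /STARTTLS=server/; # only lines that contain STARTTLS=server
    my @list = split ' ', $_;      # split on runs of whitespace
    my ($domain) = $list[-1] =~ /([^.]+\.[^.]+)$/;   # keep the last two labels, e.g. yahoo.com
    $hash{$domain}++ if defined $domain;             # hash keys remove duplicates
}
print "$_\n" for sort keys %hash;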


Last edited by ghostdog74; 04-14-2009 at 01:35 AM..
# 7  
Old 04-14-2009
Quote:
Originally Posted by ghostdog74
relax.
Summer cherry is a jerk, and I am perfectly relaxed.