Script to read file and extract data by matching pattern


 
# 1  
Old 06-02-2011

Hello,
I have a file (say file1) that contains lines like the ones below.
Code:
xxxx:xxxx,yyyy,1234,efgh
zzzz:zzzz,kkkk,pppp,1234,xxxx,uuuu,oooo
dddd:dddd

Here the word before the ":" (e.g. xxxx) is a file name, and the strings after the ":" are also file names, separated by ",".
On the first line, xxxx depends on the files xxxx, yyyy, 1234 and efgh.
On the second line, zzzz depends on the files kkkk, pppp, 1234, xxxx, uuuu and oooo.

Each file can have zero or more dependencies (line 3 has none), and the first name after the ":" is always the file itself.
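
To make the format concrete, here is a minimal sketch that splits file1-style lines into master/dependency pairs. The function name list_deps is made up for illustration, and it skips the first name after the ":" on the assumption stated above that it is always the file itself.

```shell
#!/bin/sh
# list_deps (hypothetical helper): read file1-style lines on standard input
# and print "master<TAB>dependency" for every real dependency.
list_deps() {
    awk -F'[:,]' '{
        # $1 is the master, $2 repeats the master, $3..$NF are the dependencies
        for (i = 3; i <= NF; i++) print $1 "\t" $i
    }'
}

# Example: the first line yields three pairs, the second line yields none.
printf '%s\n' 'xxxx:xxxx,yyyy,1234,efgh' 'dddd:dddd' | list_deps
```

A line such as dddd:dddd has only two fields, so the loop body never runs and nothing is printed for it.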

I have another file (say file2) that contains only the master file names, with their paths. It looks like this:
Code:
/data/testing2/zzzz
/data/testing1/xxxx
/data/test/dddd
/data/test3/fffff

Now I want to create another file (say file3, the output) that lists, one per line and with the path if possible, all the dependent files of each file in file2.
With file1 and file2 as input, file3 should look something like this once the script has run:
Code:
/data/testing1/kkkk
/data/test1/pppp
/data/testing1/1234
/data/test1/xxxx
/data/test1/uuuu
/data/testing3/oooo
/data/testing1/yyyy
/data/test/1234
/data/post/efgh

I would be happy if someone could help me with a script to do this.

Last edited by Franklin52; 06-02-2011 at 10:33 AM.. Reason: Please use code tags
# 2  
Old 06-02-2011
As per your input files, this line of your expected output is wrong:

/data/testing1/kkkk

It should be

/data/testing2/kkkk, because kkkk depends on zzzz and the path of zzzz is /data/testing2/

Please state your input and output formats clearly.

I am confused.
# 3  
Old 06-02-2011
Is this what you are looking for?
Code:
 awk -F"/" 'NR==FNR{for(i=1;i<NF;i++){a[$NF]=a[$NF] $i FS}} a[$1]{for(i=2;i<=NF;i++){print a[$1] $i}}' file2 FS="[:,]" file1

# 4  
Old 06-02-2011
@itkamaraj: the dependent file can be anywhere. Basically the script has to find the dependent file under /data; it could be /data/testing2/kkkk or /data/testing1/kkkk.
Sorry if I'm confusing you more :-). Let me know if you need more clarification.

---------- Post updated at 12:38 PM ---------- Previous update was at 12:25 PM ----------

Pravin, thanks for the quick reply. The script works fine, but it gives each dependent file the same path as its master file. In my case the dependent files can be in any location under /data. How do we go about this?
Alternatively, I can maintain another file (say file4) that holds the names and paths of all files, so that after getting a dependent file name from file1 I can look it up in file4 and output the matching line with its path.
Kindly suggest.
# 5  
Old 06-02-2011
If you only want to locate the files, what is file2 for? Does the code below help?
Code:
perl -nle '@flds=split(",",substr($_,index($_,":")+1));for($i=0;$i<=$#flds;$i++){system("find","/data","-name",$flds[$i]);}' file1

# 6  
Old 06-02-2011
Thanks pravin. file1 lists all files and their dependencies (it is a kind of master file), and file2 has only the subset of files that I need from file1.
For instance, file1 has 100 files (lines) with their dependencies, but I only need the 10 files named in file2, and their dependencies, to be found and listed.
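
The subset step described here might look like the sketch below. The function name select_masters is hypothetical, and the matching is by base name only, so two masters with the same name in different directories are not told apart.

```shell
#!/bin/sh
# select_masters FILE1 FILE2 (hypothetical helper)
# Print only the FILE1 lines whose master file is listed in FILE2.
select_masters() {
    awk '
        # First file read (FILE2): remember the base name of each wanted master.
        NR == FNR { n = split($0, part, "/"); wanted[part[n]] = 1; next }
        # Second file read (FILE1): keep the line if the name before ":" is wanted.
        { split($0, f, ":"); if (f[1] in wanted) print }
    ' "$2" "$1"
}
```

The filtered lines can then be passed on to whatever step resolves the dependency names to paths.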
# 7  
Old 06-02-2011
Try this,
FindFile.pl
Code:
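#!/usr/bin/perl
use strict;
use warnings;

# Usage: perl FindFile.pl file1 file2
my ($file1, $file2) = @ARGV;
open(my $fh1, "<", $file1) or die "Fail- $!\n";
open(my $fh2, "<", $file2) or die "Fail- $!\n";

# Remember the base name of every master file listed in file2.
my %lookup;
while (<$fh2>) {
    chomp;
    my @flds = split(/\//);
    $lookup{$flds[-1]}++ if @flds;
}

# For each file1 line whose master is in file2, search /data for every
# name after the ":" (the first of which is the master itself).
while (<$fh1>) {
    chomp;
    if (/^(.+?):/ && exists $lookup{$1}) {
        my @flds = split(/,/, substr($_, index($_, ":") + 1));
        # List-form system() bypasses the shell, so unusual characters
        # in a file name are not interpreted as shell syntax.
        system("find", "/data", "-name", $_) for @flds;
    }
}

close($fh1);
close($fh2);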

Invocation
Code:
perl FindFile.pl file1 file2
