Extract data according to keys from filename mentioned in file


 
# 1  
Old 03-03-2015

Hello experts, I want to join a file with the files whose names are listed in one of its own columns.
File 1

Code:
t1,a,b,file number 1
t1,a,c,file number 1
t2,c,d,file number 2
t2,c,e,file number 2
t2,c,f,file number 2
t2,c,g,file number 2
t3,e,f,file number 3

file number 1
Code:
t,h1,h2,c1,c2,c3
t1,a,b,v1,v2,v3
t1,x,y,g1,g2,g3
t1,a,b,v4,v5,v6
t1,x,y,h1,h2,h3
t1,a,b,v7,v8,v9
t1,a c,t1,t2,t3
t1,a,c,t2,t3,t4


file number 2
Code:
t,h1,h2,c11,c12,c13
t2,c,d,v11,v12,v13
t2,x,y,g11,g12,g13
t2,c,e,v14,v15,v16
t2,c,f,h11,h12,h13
t2,c,g,v17,v18,v19


file number 3
Code:
t,h1,h2,c12,c22,c23
t3,e,f,v31,v32,v33
t3,x,y,g11,g12,g13
t3,e,f,v34,v35,v36
t3,e,f,h31,h32,h33
t3,e,f,v37,v38,v39


I want to read each line of File 1, build a key from columns 2 and 3, and extract the rows matching that key from the file named in column 4. The file names contain spaces, as shown above. I also want each output file to start with the header line of its data file.

out_t1
Code:
t,h1,h2,c1,c2,c3
t1,a,b,v1,v2,v3
t1,a,b,v4,v5,v6
t1,a,b,v7,v8,v9
t1,a c,t1,t2,t3
t1,a,c,t2,t3,t4


out_t2
Code:
t,h1,h2,c11,c12,c13
t2,c,d,v11,v12,v13
t2,c,e,v14,v15,v16
t2,c,f,h11,h12,h13
t2,c,g,v17,v18,v19


out_t3
Code:
t,h1,h2,c12,c22,c23
t3,e,f,v31,v32,v33
t3,e,f,v34,v35,v36
t3,e,f,h31,h32,h33
t3,e,f,v37,v38,v39

This is what I am trying to do:
Code:
awk -F, 'NR==FNR {k[$2$3]=$4;next} $2$3 in k { print }' OFS="," file1  $4  >>  out_$1

I am not sure how to pick up the 4th column (the data file name) and the 1st column (the output name) from the first file, or how to carry the header over to the output.

Please assist.

Last edited by Scrutinizer; 03-03-2015 at 03:23 PM.. Reason: Changed unclear code tags to one set of code tags per sample file
# 2  
Old 03-03-2015
I am assuming there is a missing comma at line 7 in file number one. Try:

Code:
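awk -F, '
  NR==FNR {                  # first file ("file 1"): remember every line as a key
    A[$0]
    next
  }
  FNR==1 {                   # first line of each data file: switch output file
    close(f)
    f="out_t" ++c            # out_t1, out_t2, ... in the order the files are read
  }
  FNR==1 || $1 FS $2 FS $3 FS FILENAME in A {   # header, or cols 1-3 plus file name match a key
    print>f
  }
' "file 1" "file number "*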

# 3  
Old 03-03-2015
Try also
Code:
awk     'FNR==1         {FNo++
                         if (FNo < 4) print > "out_t" substr(FILENAME, length(FILENAME))}

         FNo < 4        {DT[NR]=$0","FILENAME
                         MX=NR
                         next 
                        }

                        {for (i=1; i<=MX; i++)
                                if (DT[i] ~ $2"[, ]"$3".*"$NF"$")
                                        {sub (/,[^,]*$/, "", DT[i])
                                         print DT[i] > "out_t" substr ($4, length($4))
                                        }
                        }
        ' FS="," "file number "[123] file1

# 4  
Old 03-03-2015
Quote:
Originally Posted by Scrutinizer
I am assuming there is a missing comma at line 7 in file number one.
Many thanks, and sorry about the missing comma. Your code works great with the example files. My actual file names do not follow a pattern like "file number N". Can I just point to the directory instead, replacing "file number "* with the directory path, e.g. "./*"?

Also, the output file name should come from the first column of File 1, so I have modified your auto-incremented file name to pick it up from there.

Please let me know if you see any syntax or logical mistakes.

Code:
awk -F, '
  NR==FNR {
    A[$0]
    B[$2 FS $3]=$1
    next
  }
  FNR==1 || $1 FS $2 FS $3 FS FILENAME in A {
    print> out_B[$2 FS $3]
  }
' "file 1" "/DIRpath/*"
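
A possible corrected version of the snippet above, offered as a sketch rather than something tested against the real data. Three things stand out in the original: awk reads `out_B[...]` as a lookup in an empty array named `out_B`, not as "out_" followed by a lookup in `B`; the header line has `h1,h2` in columns 2 and 3, so a lookup keyed on those columns finds no output name for it; and a quoted "/DIRpath/*" is passed to awk literally instead of being expanded by the shell. The sketch keys the output name on the data file name instead. The `data` directory, the file names `alpha file` and `beta file`, the array `out` and the variables `o` and `name` are made up for the demo, and it assumes each data file feeds exactly one output file.

```shell
# Demo setup: a scratch directory with made-up sample files.
work=$(mktemp -d) && cd "$work" || exit 1
mkdir data                       # stands in for /DIRpath

printf '%s\n' 't1,a,b,alpha file' 't1,a,c,alpha file' 't2,c,d,beta file' > 'file 1'
printf '%s\n' 't,h1,h2,c1' 't1,a,b,v1' 't1,x,y,g1' 't1,a,c,t2' > 'data/alpha file'
printf '%s\n' 't,h1,h2,c11' 't2,c,d,v11' 't2,x,y,g11' > 'data/beta file'

awk -F, '
  NR == FNR {                    # first file: the list of keys
    A[$0]                        # whole line: col1,col2,col3,data file name
    out[$4] = "out_" $1          # output name taken from column 1
    next
  }
  FNR == 1 {                     # header line of each data file
    if (o != "") close(o)
    name = FILENAME
    sub(/.*\//, "", name)        # FILENAME carries the directory, column 4 does not
    o = (name in out) ? out[name] : ""
    if (o != "") print > o       # copy the header to the output
    next
  }
  # data line: keep it only if this file is wanted and the key is listed
  o != "" && (($1 FS $2 FS $3 FS name) in A) { print > o }
' 'file 1' data/*

cat out_t1 out_t2
```

With a real directory, the last line of the awk call would become `' "file 1" /DIRpath/*`, leaving the glob outside the quotes so the shell expands it. Files in the directory that are not named in column 4 are skipped.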
