

Extract specific lines based on another file


 
# 1  
Old 11-08-2015

I have a folder containing text files. I need to extract specific lines from these files based on another file, input.txt, which lists each file name (prefixed with *) followed by the third-column values of the lines to keep. How can I do this with awk/sed?

Code:
file1
ARG	81.9	8	81.9	0
LEU	27.1	9	27.1	0
PHE	.0	10	.0	0
ASP	59.8	11	59.8	0
ASN	27.6	12	27.6	0
ALA	.0	13	.0	0
MET	13.1	14	13.1	0
LEU	66.8	15	66.8	0
ARG	21.0	16	21.0	0

file2
SER	57.9	43	57.9	0
PHE	2.4	44	2.4	0
LEU	39.4	45	1.0	38.4
GLN	83.9	46	40.8	43.1
ASN	46.9	47	46.9	0
PRO	47.1	48	4.8	42.3
GLN	86.1	49	83.2	2.9
THR	33.2	50	33.2	0
SER	10.2	51	.9	9.3

input.txt

*file1
10
16
*file2
43
44
49

Desired output

file1
PHE	.0	10	.0	0
ARG	21.0	16	21.0	0

file2
SER	57.9	43	57.9	0
PHE	2.4	44	2.4	0
GLN	86.1	49	83.2	2.9

# 2  
Old 11-08-2015
Any attempts from your side?

---------- Post updated at 09:56 ---------- Previous update was at 09:25 ----------

Howsoever, try
Code:
awk '
FNR==NR         {if (/^\*/)     {TFN = substr ($0,2)
                                 SAM[TFN] =  "-"
                                 next
                                }
                 SAM[TFN] = SAM[TFN] $0 "-"
                 next
                }

SAM[FILENAME] ~ "-" $3 "-"      {print > FILENAME ".res"}

' input.txt file1 file2
cf *.res
file1.res:
PHE      .0    10    .0      0
ARG    21.0    16    21.0    0
file2.res:
SER    57.9    43    57.9    0
PHE    2.4     44    2.4     0
GLN    86.1    49    83.2    2.9

This User Gave Thanks to RudiC For This Post:
# 3  
Old 11-08-2015
Hi RudiC,
Could you please explain the code? I am not sure I have understood all of it correctly.
FNR==NR ## just compares the record numbers of the two files. While reading the first file the condition is true, so it jumps to the next block.
{if (/^\*/) ## searches for a pattern starting with an asterisk, which I found nowhere, so what is the purpose?
{TFN = substr ($0,2) ## for all fields, the characters from the 2nd onward are assigned to the variable TFN
SAM[TFN] = "-" ## an associative array is declared and the value "-" is assigned to it.
SAM[TFN] = SAM[TFN] $0 "-" ## if the IF condition is false, "-" is added after each record.

The rest is going over my head.

Last edited by looney; 11-08-2015 at 01:51 PM..
# 4  
Old 11-08-2015
@looney

Code:
awk '
# start of the code block that runs only for the first file on the command line
FNR==NR         {if (/^\*/)     {TFN = substr ($0,2) # strip the leading * from the file name and save it in a temporary file name variable
                                 SAM[TFN] =  "-" # start the list with a marker ("-") that delimits the entries
                                 next   # skip to the next line of the first file, ignoring the remaining code blocks
                                }
                 # the following statement runs for the lines that are not file name markers
                 SAM[TFN] = SAM[TFN] $0 "-" # append the whole record to the entry for the current key, followed by a separator marker
                 next # skip to the next line in the same file, ignoring the rest
                }
# end of the block for the first file on the command line

# the following block applies only to the remaining files on the command line, not the first
SAM[FILENAME] ~ "-" $3 "-"      {print > FILENAME ".res"} # look up FILENAME (the file currently being processed) in the previously built data structure
# and test whether its entry matches the pattern built by concatenating "-" $3 "-" (example: "-10-"); if it does, send the current $0 to the file FILENAME.res
' input.txt file1 file2

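To make the commented code above more concrete, here is a small sketch that dumps what SAM holds once input.txt has been read. The scratch directory, the END rule and the variable names `table` and `hits` are only there for the demonstration and are not part of the original solution.

```shell
# Scratch copy of input.txt from the first post (the temp dir is only for this demo).
tmp=$(mktemp -d)
printf '%s\n' '*file1' 10 16 '*file2' 43 44 49 > "$tmp/input.txt"

# Same two statements as the FNR==NR block, plus an END rule that dumps SAM.
table=$(awk '
/^\*/ { TFN = substr($0, 2); SAM[TFN] = "-"; next }  # "*file1" becomes key "file1"
      { SAM[TFN] = SAM[TFN] $0 "-" }                 # append "10-", then "16-", ...
END   { for (k in SAM) print k, SAM[k] }
' "$tmp/input.txt" | sort)
printf '%s\n' "$table"
# file1 -10-16-
# file2 -43-44-49-

# Why the separators matter: "-1-" must not match inside "-10-16-".
hits=$(awk 'BEGIN { s = "-10-16-"; a = (s ~ "-10-"); b = (s ~ "-1-"); print a, b }')
printf '%s\n' "$hits"
# 1 0

rm -rf "$tmp"
```

So the last rule is a plain substring test: for a line of file1 with $3 equal to 10, the string "-10-" is searched in "-10-16-". Without the surrounding "-" markers a value such as 1 would also hit on 10 or 16.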
Last edited by Aia; 11-08-2015 at 03:05 PM..
# 5  
Old 11-08-2015
Another one:
Code:
awk 'NR==FNR{if(/^\*/) f=substr($1,2); else A[f,$1]; next} (FILENAME,$3) in A {print>(FILENAME ".res")}' input.txt file[12]


--
Note:

Use parentheses around the filename concatenation, or most awks will protest with a syntax error.
Code:
{print > (FILENAME ".res")}

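For anyone who wants to try the one-liner, here is a rough self-contained run against a subset of the sample data from the first post. The scratch directory and the variables `res1` and `res2` are assumptions made for this sketch. `A[f,$1]` stores a single key made of f, the built-in SUBSEP and $1, and `(FILENAME,$3) in A` tests for exactly that key.

```shell
tmp=$(mktemp -d)
# Tab-separated sample rows taken from the first post (a subset, to keep this short).
printf '%s\t%s\t%s\t%s\t%s\n' PHE .0 10 .0 0  ASP 59.8 11 59.8 0  ARG 21.0 16 21.0 0 > "$tmp/file1"
printf '%s\t%s\t%s\t%s\t%s\n' SER 57.9 43 57.9 0  PHE 2.4 44 2.4 0  LEU 39.4 45 1.0 38.4  GLN 86.1 49 83.2 2.9 > "$tmp/file2"
printf '%s\n' '*file1' 10 16 '*file2' 43 44 49 > "$tmp/input.txt"

# Run inside the directory so that FILENAME is exactly "file1" / "file2",
# matching the names listed in input.txt.
( cd "$tmp" && awk '
NR == FNR { if (/^\*/) f = substr($1, 2); else A[f, $1]; next }  # key is f SUBSEP $1
(FILENAME, $3) in A { print > (FILENAME ".res") }
' input.txt file1 file2 )

res1=$(cat "$tmp/file1.res")
res2=$(cat "$tmp/file2.res")
printf 'file1.res:\n%s\nfile2.res:\n%s\n' "$res1" "$res2"
rm -rf "$tmp"
```

Note that FILENAME holds the name exactly as it was given on the command line, so calling awk with a path such as dir/file1 would no longer match the key "file1" from input.txt. Because the lookup is an exact key test rather than a regex match, no separator markers are needed here.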

Last edited by Scrutinizer; 11-08-2015 at 04:17 PM..
# 6  
Old 11-08-2015
Quote:
Originally Posted by looney
Hi RudiC,
Could you please explain the code? I am not sure I have understood all of it correctly.
FNR==NR ## just compares the record numbers of the two files. While reading the first file the condition is true, so it jumps to the next block. <-- the jump to the next block happens only for the second and later files
{if (/^\*/) ## searches for a pattern starting with an asterisk, which I found nowhere, so what is the purpose? <-- look at the spec for input.txt: the file names there carry a leading *
{TFN = substr ($0,2) ## for all fields, the characters from the 2nd onward are assigned to the variable TFN <-- only for the line with the leading *; shortcut for substr ($0, 2, rest of $0) := the file name to which the comparison is to be applied
SAM[TFN] = "-" ## an associative array is declared and the value "-" is assigned to it. <-- yes, as a starting point; additional elements are appended later
SAM[TFN] = SAM[TFN] $0 "-" ## if the IF condition is false, "-" is added after each record. <-- concatenates the values into SAM[TFN], separated by "-"

The rest is going over my head.

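Since the FNR==NR test caused the most confusion, here is a short trace of how NR and FNR develop over two files. It is only an illustration: the scratch directory, the shortened sample files and the labels "first-file block" and "filter rule" are made up for this sketch.

```shell
tmp=$(mktemp -d)
printf '%s\n' '*file1' 10 > "$tmp/input.txt"
printf '%s\n' 'PHE .0 10 .0 0' 'ARG 21.0 16 21.0 0' > "$tmp/file1"

# NR counts records across all files; FNR restarts at 1 for every new file.
trace=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, (FNR == NR ? "first-file block" : "filter rule") }' input.txt file1)
printf '%s\n' "$trace"
# input.txt 1 1 first-file block
# input.txt 2 2 first-file block
# file1 3 1 filter rule
# file1 4 2 filter rule

rm -rf "$tmp"
```

The two counters are equal only while the first file is being read, which is why FNR==NR selects the lines of input.txt. One caveat: if the first file is empty, the condition is also true for the second file, so the idiom assumes a non-empty input.txt.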