Print lines meet requirement


 
# 1  
Old 10-20-2014

Dear Masters,

I have the two input files below.

file1
Code:
8269229289|CROATIA|LUX
8269229412|ASIA|LUX
8269229371|EUROPE|LUX
8269229355|LANE|LUX
8269229469|SWISS|LUX
8269229477|HAMBURG|LUX
8269229484|EGYPT|LUX
8269229485|GERMANY|LUX
8269229498|CROATIA|LUX

File2
Code:
8269229289|1100100020
8269229289|1100100122
8269229412|1100100128
82160398467|1100100140
8269229412|1100100195
8269229412|1100100202
8269229355|1100100550
8269229484|1100100568
8269229484|1100100612

I need to print file1 based on the occurrences in file2, so my output should look like
Code:
8269229289|CROATIA|LUX|1100100020
8269229289|CROATIA|LUX|1100100122
8269229412|ASIA|LUX|1100100128
8269229371|EUROPE|LUX|
8269229355|LANE|LUX|1100100550
8269229469|SWISS|LUX|
8269229477|HAMBURG|LUX|
8269229484|EGYPT|LUX|1100100568
8269229484|EGYPT|LUX|1100100612
8269229485|GERMANY|LUX|
8269229498|CROATIA|LUX|

8269229289 appears twice in the output because it also appears twice in file2.


I did this
Code:
awk -F'|' 'NR==FNR {h[$1] = $2; next} {FS=OFS="|";print $0,h[$1]}'

but each line appears only once.

Please help.
# 2  
Old 10-20-2014
Try

Code:
awk 'NR==FNR{
        h[$1] = ($1 in h) ? h[$1] OFS $2 : $2
        next
     }
     {
        if (split(h[$1], part, OFS) > 1)
        {
            for (i in part)
            {
                print $0 OFS part[i]
            }
            next
        }
        print $0, (length(h[$1]) ? h[$1] : "Empty")
     }
    ' FS="|" OFS="|" file2 file1

This User Gave Thanks to Akshay Hegde For This Post:
# 3  
Old 10-20-2014
oh my gosh..
can you explain step by step.
it works
# 4  
Old 10-20-2014
Quote:
Originally Posted by radius
oh my gosh..
can you explain step by step.
it works
Code:
awk 'NR==FNR{
        # file2 is read first.
        # Build hash h with column 1 ($1) as the index and column 2 ($2) as the element.
        # If index $1 already exists, append the current $2 to the existing element,
        # using OFS as the separator between the old and the new value.

        h[$1] = ($1 in h) ? h[$1] OFS $2 : $2

        # Stop processing this line and go to the next one
        next
     }
     {
        # From here on we are reading file1.
        # split(string, array, fieldsep) divides string into pieces separated by
        # fieldsep, stores the pieces in array, and returns the number of pieces.
        # If that number is greater than 1, this key has several values to print.

        if (split(h[$1], part, OFS) > 1)
        {
            # loop over the elements of array part
            for (i in part)
            {
                # print the current line, OFS, and the array element
                print $0 OFS part[i]
            }
            # stop processing this line and go to the next one
            next
        }

        # Print the current line, OFS, and the content of hash h for index $1;
        # if that element has zero length, print the string "Empty" instead
        print $0, (length(h[$1]) ? h[$1] : "Empty")
     }
    ' FS="|" OFS="|" file2 file1
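
Two details of the script above may matter in practice: `for (i in part)` visits array elements in an order that awk leaves unspecified, and unmatched lines get the string "Empty" rather than the empty fourth field shown in post #1. The following is only a sketch of one possible variant, assuming the sample data from post #1; the array names `n` and `val` and the scratch directory are illustrative, not part of the original answer.

```shell
# Sample data copied from post #1, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
8269229289|CROATIA|LUX
8269229412|ASIA|LUX
8269229371|EUROPE|LUX
8269229355|LANE|LUX
8269229469|SWISS|LUX
8269229477|HAMBURG|LUX
8269229484|EGYPT|LUX
8269229485|GERMANY|LUX
8269229498|CROATIA|LUX
EOF
cat > "$tmp/file2" <<'EOF'
8269229289|1100100020
8269229289|1100100122
8269229412|1100100128
82160398467|1100100140
8269229412|1100100195
8269229412|1100100202
8269229355|1100100550
8269229484|1100100568
8269229484|1100100612
EOF

result=$(awk -F'|' -v OFS='|' '
    NR == FNR {                         # first file on the command line: file2
        n[$1]++                         # how many values this key has so far
        val[$1, n[$1]] = $2             # store each value under (key, position)
        next
    }
    !($1 in n) { print $0, ""; next }   # no match: keep the line, empty 4th field
    {
        for (i = 1; i <= n[$1]; i++)    # numeric loop keeps the file2 order
            print $0, val[$1, i]
    }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sketch, every file1 line is printed once per matching file2 line, in file1 order, and unmatched lines end in a bare `|`.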

# 5  
Old 10-20-2014
Why doesn't "8269229412" show up thrice in your sample output file?
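
A quick way to check how often each key should appear is to count the keys in file2. This is just a sketch on the sample file2 from post #1; the array name `c` and the scratch directory are illustrative.

```shell
# file2 from post #1, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file2" <<'EOF'
8269229289|1100100020
8269229289|1100100122
8269229412|1100100128
82160398467|1100100140
8269229412|1100100195
8269229412|1100100202
8269229355|1100100550
8269229484|1100100568
8269229484|1100100612
EOF

counts=$(awk -F'|' '
    { c[$1]++ }                          # tally each key in column 1
    END { for (k in c) print k, c[k] }   # traversal order is unspecified
' "$tmp/file2" | sort)                   # sort only to get a stable listing

printf '%s\n' "$counts"
rm -rf "$tmp"
```

The listing reports a count of 3 for 8269229412, so a join on file2 occurrences yields three lines for that key.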

Try (adapting your own approach):
Code:
awk     'NR==FNR        {h[$1] = $0; next}
         $1 in h        {print h[$1],$NF; N[$1]}
         END            {for (i in h) if (!(i in N)) print h[i]}
        ' FS=\| OFS=\| file1 file2
8269229289|CROATIA|LUX|1100100020
8269229289|CROATIA|LUX|1100100122
8269229412|ASIA|LUX|1100100128
8269229412|ASIA|LUX|1100100195
8269229412|ASIA|LUX|1100100202
8269229355|LANE|LUX|1100100550
8269229484|EGYPT|LUX|1100100568
8269229484|EGYPT|LUX|1100100612
8269229371|EUROPE|LUX
8269229469|SWISS|LUX
8269229477|HAMBURG|LUX
8269229485|GERMANY|LUX
8269229498|CROATIA|LUX

@Akshay Hegde: your approach seems to print extra lines:
Code:
8269229289|CROATIA|LUX|
8269229289|CROATIA|LUX|1100100020
8269229289|CROATIA|LUX|1100100122
etc.


Last edited by RudiC; 10-20-2014 at 10:21 AM..
# 6  
Old 10-20-2014
How about

Code:
 
awk -F'|' 'NR==FNR{a[$1]=$0;next}{for (i in a) gsub(i,a[i])}1' file2 file1

# 7  
Old 10-20-2014
Hello radius,

The following may help, but it is limited to the input shown, because I have hardcoded the line numbers in the solution. (I am working on how to get the number of lines of each input file so that they do not need to be hardcoded.)

Code:
awk -F"|" 'FNR==NR{A[$1]=$0;next} NR>=10 && NR<=18($1 in A && A[$1]){if(A[$1]){print A[$1] OFS $2;next}} NR>18($1 in A){delete A[$1];next} END{for(r in A){print A[r]}}' OFS="|"  file1 file2 file2

As of now it works only for the given input; I am trying to find a solution without the hardcoding.
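
One possible way around the hardcoded `NR` ranges, offered only as a sketch: since `FNR` restarts at 1 for every input file, a counter bumped on `FNR == 1` tells the script which file it is reading, so no line counts are needed and file2 is read only once. The names `pass` and `used` are illustrative, and the sketch assumes neither file is empty.

```shell
# Sample data copied from post #1, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
8269229289|CROATIA|LUX
8269229412|ASIA|LUX
8269229371|EUROPE|LUX
8269229355|LANE|LUX
8269229469|SWISS|LUX
8269229477|HAMBURG|LUX
8269229484|EGYPT|LUX
8269229485|GERMANY|LUX
8269229498|CROATIA|LUX
EOF
cat > "$tmp/file2" <<'EOF'
8269229289|1100100020
8269229289|1100100122
8269229412|1100100128
82160398467|1100100140
8269229412|1100100195
8269229412|1100100202
8269229355|1100100550
8269229484|1100100568
8269229484|1100100612
EOF

result=$(awk -F'|' -v OFS='|' '
    FNR == 1  { pass++ }                       # FNR resets to 1 at the start of each file
    pass == 1 { A[$1] = $0; next }             # file1: remember each line by its key
    ($1 in A) { print A[$1], $2; used[$1] }    # file2: one joined line per match
    END {
        for (k in A)                           # file1 keys never seen in file2
            if (!(k in used)) print A[k], ""   # (order of this loop is unspecified)
    }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Matched lines come out in file2 order; the unmatched file1 lines follow at the end with an empty fourth field, in no guaranteed order.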

Hello senhia83,
Quote:
How about

awk -F'|' 'NR==FNR{a[$1]=$0;next}{for (i in a) gsub(i,a[i])}1' file2 file1
This does not give the output requested by the OP.


Thanks,
R. Singh
 