Count lines containing substring


 
# 1  
Old 02-05-2013

I have 2 files, and I want to count how many lines contain matching words.

Example:
file1
Code:
a_+b
a_+b_+c

file2
Code:
ab a_+b
a_+bc

I want to get 1, because the first line of file1 is a substring of the first line of file2, while the second line of file1 is not a substring of the second line of file2.

I suspect sdiff could do this, but I am not sure how to make it match a whole word (words are separated by spaces).
# 2  
Old 02-05-2013
Your problem statement is ambiguous, but it's possible that the only thing you need is grep. Look at the -F, -f, and -c options.

Regards,
Alister
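
For reference, a minimal sketch of what those three grep options do together, using the sample data from post #1 (the temporary directory and the variable names are only illustrative). Note that this counts the lines of file2 that contain any line of file1 as a fixed string, regardless of line position, so it answers a different question from a line-by-line comparison:

```shell
# Sample data copied from post #1, written to a throwaway directory.
dir=$(mktemp -d)
printf '%s\n' 'a_+b' 'a_+b_+c' > "$dir/file1"
printf '%s\n' 'ab a_+b' 'a_+bc' > "$dir/file2"

# -F  treat patterns as fixed strings, so "+" is not a regex operator
# -f  read the patterns from file1, one per line
# -c  print the number of matching lines instead of the lines themselves
total=$(grep -c -F -f "$dir/file1" "$dir/file2")
echo "$total"   # prints 2: both lines of file2 contain the string a_+b
```

On this data the result is 2 rather than the desired 1, because the second line of file2 contains the first line of file1.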
# 3  
Old 02-05-2013
Hi Alister,

I looked at these options, but I am not so sure how to use them here.

So in other words, the problem is:
given line i in file1, is there a matching word in line i of file2?
If yes, count this line.
# 4  
Old 02-05-2013
The following assumes that neither file contains tabs. If either does, you'll need to choose an appropriate delimiter.
Code:
paste file1 file2 | awk -F'\t' 'index($2, $1)' | wc -l

If you or someone else requires every last bit of efficiency, the entire thing can be done within AWK. I optimized for simplicity.

Regards,
Alister
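
A possible AWK-only variant of the pipeline in post #4, offered as a sketch rather than as the poster's own code. It assumes file1 is not empty (the `NR == FNR` idiom relies on that), and the array name `want` and the shell variable names are made up for illustration. The second command is an assumed whole-word variant: it pads both strings with spaces so that a match has to start and end at a word boundary, which only works if words are separated by single spaces.

```shell
# Sample data copied from post #1, written to a throwaway directory.
dir=$(mktemp -d)
printf '%s\n' 'a_+b' 'a_+b_+c' > "$dir/file1"
printf '%s\n' 'ab a_+b' 'a_+bc' > "$dir/file2"

# Substring test: is line i of file1 contained in line i of file2?
count=$(awk '
    NR == FNR { want[FNR] = $0; next }           # first file: remember each line
    FNR in want && index($0, want[FNR]) { n++ }  # second file: fixed-string test
    END { print n + 0 }                          # n + 0 prints 0 if nothing matched
' "$dir/file1" "$dir/file2")
echo "$count"   # prints 1

# Whole-word test: surround both strings with spaces before comparing.
whole=$(awk '
    NR == FNR { want[FNR] = $0; next }
    FNR in want && index(" " $0 " ", " " want[FNR] " ") { n++ }
    END { print n + 0 }
' "$dir/file1" "$dir/file2")
echo "$whole"   # prints 1
```

Both commands report 1 on the sample data. They would differ on a pair such as `a_+b` in file1 and `xa_+b` in file2, where the substring test matches and the whole-word test does not.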