Grep a string and count following lines starting with another string


 
# 1  
Old 10-14-2015

I have a large dataset with the following structure:

Code:
  C 0001 Carbon [C]
  D SAR001 methane [CH3]
  D SAR002 ethane
  D SAR003 propane
  D SAR004 butane
  D SAR005 pentane
  C 0002 Hydrogen [H]
  C 0003 Nitrogen [N]
  C 0004 Oxygen [O]
  D SAR011 ozone
  D SAR012 super oxide
  C 0005 Sulphur [S]
  D SAR013 Hydrogen Sulphide [H2S]
  D SAR014 Sulphuric acid
  .
  .
  .

In this dataset, lines starting with C are headings and lines starting with D are the components of the heading above them. I want to count the number of components under each heading and get output like this:

Code:
0001 5
0002 0
0003 0
0004 2
0005 2
.
.
.

The pseudocode could be:

Code:
grep ^C
count next lines with ^D
print [$2 of ^C] and [count of ^D]
restart loop


Last edited by Scrutinizer; 10-14-2015 at 04:12 AM. Reason: Code tags also for data samples
# 2  
Old 10-14-2015
Hi, try:
Code:
awk '$1=="C"{i=$2; A[i]=0} $1=="D"{A[i]++} END{for(i in A) print i,A[i]}' file

or

Code:
awk '$1=="C"{if(i!="") print i, c; i=$2; c=0} $1=="D"{c++} END{print i, c}' file
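One caveat on the first command: awk does not define the order in which `for (i in A)` visits array elements, so the headings may not come out in input order. The second command avoids this by printing as it goes. If the array form is preferred, a minimal sketch that records the headings in a second, numerically indexed array might look like the following. The function name `count_components` and the array names `order` and `count` are illustrative only, not part of the original answer.

```shell
# Sketch: count D lines under each C heading and print headings in input order.
# count_components is a hypothetical helper name; pass the data file as $1.
count_components() {
    awk '
        # A heading line: remember its position and start its counter at 0.
        $1 == "C" { order[++n] = $2; count[$2] = 0; cur = $2 }
        # A component line: credit it to the most recent heading (if any).
        $1 == "D" && n { count[cur]++ }
        # Walk the headings in the order they were read.
        END { for (k = 1; k <= n; k++) print order[k], count[order[k]] }
    ' "$1"
}
```

It would be called as `count_components file`. Like the original, it assumes heading IDs are unique; a repeated ID would reset that heading's count.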

# 3  
Old 10-14-2015
The second solution, written more verbosely:
Code:
awk '
function pr() {if (notfirst++) print heading,dcnt}
$1=="C" {pr(); heading=$2; dcnt=0}
$1=="D" {dcnt++}
END {pr()}
' file

# 4  
Old 10-14-2015
A different approach, using line numbers (it assumes every line after a heading is a D line):
Code:
awk '/^ *C/ {if (L) print NR-L-1;  printf "%s\t", $2; L=NR} END {print NR-L}' file
0001    5
0002    0
0003    0
0004    2
0005    2
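Because this relies on line-number arithmetic, every line between two C lines is counted, including blank lines or the trailing `.` lines in the sample. A possible variation, sketched here under the assumption that anything other than a C or D line should be ignored, keeps the same arithmetic but counts only relevant lines with its own counter instead of `NR`. The function name `count_by_position` and the counter `n` are illustrative only.

```shell
# Sketch: same position arithmetic as above, but with a private line counter
# so that lines which are neither C nor D do not inflate the counts.
# count_by_position is a hypothetical helper name; pass the data file as $1.
count_by_position() {
    awk '
        # Skip anything that is not a heading or a component.
        $1 != "C" && $1 != "D" { next }
        # n numbers only the C and D lines.
        { n++ }
        # On a heading: finish the previous heading, then start the new one.
        $1 == "C" { if (L) print n - L - 1; printf "%s\t", $2; L = n }
        # Finish the last heading, if there was one.
        END { if (L) print n - L }
    ' "$1"
}
```

It would be called as `count_by_position file` and prints tab-separated output like the command above.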
