count and number instances of a character in sed or awk


 
# 1  
Old 11-19-2011

I currently use LaTeX together with a sed script to set cloze test papers for my students. I prepend an equals sign to the front of the words I want to leave out in the finished test: =perpendicular, for example. I am able to number the blanks using a variable in LaTeX. I would like to switch from using LaTeX to using groff typesetting.

My question is this: can I label or replace the equals signs with a running count with sed or awk, or even troff/groff for that matter? If so, how might I do it?

Thanks
# 2  
Old 11-19-2011
A small awk script to preprocess your input:

Code:
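awk '
    /=/ {                                          # only touch lines that contain an equals sign
        for( i = 1; i <= NF; i++ )                 # walk every whitespace-separated word
        {
            if( substr( $(i), 1, 1 ) == "=" )      # word starts with "="
                $(i) = "__________[" count++ "]";  # replace the whole word with a numbered blank
        }
    }
    { print; }                                     # print every line, changed or not
' text-file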

Using your original message, I put in some equals signs:
Code:
I currently use LaTeX together with a sed script to
set cloze =test papers for my students. I currently pepend
=and equals sign to the front of the words I want to leave out
in the finished =test, =perpendicular, for example. I am
able to number the =blanks using a variable in LaTeX. I
would like to switch from using =LatTeX to using =groff typesetting.

and this is the output:
Code:
I currently use LaTeX together with a sed script to 
set cloze __________[0] papers for my students. I currently pepend 
__________[1] equals sign to the front of the words I want to leave out 
in the finished __________[2] __________[3] for example. I am 
able to number the __________[4] using a variable in LaTeX. I 
would like to switch from using __________[5] to using __________[6] typesetting.

Hope this gets you going. (I'm not a heavy *roff user, so it might be possible to do this at formatting time; I'm just not sure.)

---------- Post updated at 12:09 ---------- Previous update was at 12:08 ----------

If you'd like numbering to start with 1, change count++ to ++count.
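
A minimal sketch of that change, wrapped in a shell function so it can be tried on a single line of input. The function name number_blanks is only for illustration; the awk body is the script above with count++ swapped for ++count.

```shell
# number_blanks is a hypothetical wrapper name, not part of the original script.
number_blanks() {
    awk '
        /=/ {
            for (i = 1; i <= NF; i++)
                if (substr($i, 1, 1) == "=")
                    $i = "__________[" ++count "]"   # pre-increment: the first blank is numbered 1
        }
        { print }
    ' "$@"
}

printf 'set cloze =test papers for my =students\n' | number_blanks
# prints: set cloze __________[1] papers for my __________[2]
```

With no file argument the function reads standard input; pass a file name (number_blanks text-file) to process a file instead.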

Last edited by agama; 11-19-2011 at 09:07 PM.. Reason: correction
# 3  
Old 11-19-2011
Thank you! Thank you! Thank you. This is exactly the type of thing I was looking for. My sed skills are rudimentary and my awk is not even up to that level, but this gives me somewhere to start. I don't know if *roff can handle it at formatting time either, but this may just help me avoid trying to find out; user/newbie-friendly *roff documentation is not terribly common.

Thanks again!

---------- Post updated at 02:01 AM ---------- Previous update was at 01:37 AM ----------

This seems to be an excellent start, but one small hiccup: running it on the following text gives mostly the desired results:

Code:
I am =honored to be with you =today at your commencement =from =one =of =the =finest universities in the world. I never =graduated from =college. Truth be told, this is the =closest I've =ever =gotten to a =college =graduation. Today I want to tell you three stories =from =my =life. That's it. =No =big =deal. Just three stories.


Results in:
Code:
I am __________[1] to be with you __________[2] at your commencement __________[3] __________[4] __________[5] __________[6] __________[7] universities in the world. I never __________[8] from __________[9] Truth be told, this is the __________[10] I've __________[11] __________[12] to a __________[9]=graduation. Today I want to tell you three stories __________[3] __________[13] __________[14] That's it. __________[15] __________[16] __________[17] Just three stories.

The __________[18] __________[19] is __________[20] connecting the __________[21]

Things go slightly haywire around the word "graduation." What might be going on?
Moderator's Comments:
Please use code tags when posting data and code samples!

Last edited by vgersh99; 11-19-2011 at 03:30 PM.. Reason: code tags, please!
# 4  
Old 11-19-2011
I think you want something like this:
Code:
nawk '/=/{for(i=1;i<=NF;i++) if ($i ~/^=/) $i=("_______[" ++count "]" FS substr($i,2))}1' myFile
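
To see how this one-liner behaves compared with the script in post #2, here is a small sketch that runs it on one line of sample text. It uses plain awk in place of nawk, on the assumption that nawk is not installed everywhere, and the function name blank_keep_word is made up for the example. Note that this version keeps the original word (minus its equals sign) after the numbered blank instead of dropping it.

```shell
# blank_keep_word is a hypothetical wrapper name; the awk program is the one-liner above.
blank_keep_word() {
    awk '/=/{for(i=1;i<=NF;i++) if ($i ~ /^=/) $i=("_______[" ++count "]" FS substr($i,2))}1' "$@"
}

printf 'I never =graduated from =college.\n' | blank_keep_word
# prints: I never _______[1] graduated from _______[2] college.
```

If the word itself should disappear from the test paper, drop the FS substr($i,2) part of the assignment.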

# 5  
Old 11-19-2011
@vgersh99


Quote:
nawk '/=/{for(i=1;i<=NF;i++) if ($i ~/^=/) $i=("_______" ++count FS substr($i,2))}1' myFile
Does the code
Code:
($i ~/^=/)

mean that we search for an equals sign at the beginning (^) of $i? And if we find it, we prepend all the found strings in $i with ________?

Thanks,
jaysunn
# 6  
Old 11-19-2011
Quote:
Originally Posted by jaysunn
@vgersh99




Does the code
Code:
($i ~/^=/)

mean that we search for an equals sign at the beginning (^) of $i? And if we find it, we prepend all the found strings in $i with ________?

Thanks,
jaysunn
Yes, almost. There's only one 'string' in $i. The code replaces $i with "_______[" followed by the running tally of 'count', followed by "]", FS (a space in our case), and the rest of the original word after the equals sign.
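
A short sketch of what the ^ anchor does in that test, assuming any POSIX awk. The function name starts_with_equals and the sample words are invented for the example; it prints each word with the result of the same $i ~ /^=/ check.

```shell
# starts_with_equals is a hypothetical helper used only to show the match result per word.
starts_with_equals() {
    awk '{ for (i = 1; i <= NF; i++) print $i, (($i ~ /^=/) ? "match" : "no match") }' "$@"
}

printf '=college a=b plain\n' | starts_with_equals
# prints:
# =college match
# a=b no match
# plain no match
```

Because of the anchor, an equals sign in the middle of a word (a=b) is left alone; only words that begin with one are turned into blanks.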
# 7  
Old 11-19-2011
Thanks for all replies. This seems to work beautifully now:
Code:
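awk '
    /=/ {                                          # only touch lines that contain an equals sign
        for( i = 1; i <= NF; i++ )                 # walk every whitespace-separated word
        {
            if( substr( $(i), 1, 1 ) == "=" )      # word starts with "="
                $(i) = "__________[" count++ "]";  # replace the whole word with a numbered blank
        }
    }
    { print; }                                     # print every line, changed or not
' text-file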


This text:
Code:
I am =honored to be with you =today at your
commencement =from =one =of =the =finest universities
in the world. I never =graduated from =college. Truth
be told, this is the =closest I've =ever =gotten to a
=college =graduation. Today I want to tell you three
stories =from =my =life. That's it. =No =big =deal.
Just three stories.

gives:
Code:
I am __________[0] to be with you __________[1] at your
commencement __________[2] __________[3] __________[4] __________[5] __________[6] universities
in the world. I never __________[7] from __________[8] Truth
be told, this is the __________[9] I've __________[10] __________[11] to a
__________[12] __________[13] Today I want to tell you three
stories __________[14] __________[15] __________[16] That's it. __________[17] __________[18] __________[19]
Just three stories.

Sorry for the lack of tags and poor formatting before. This is my first thread/post on your site.
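
One detail visible in the output above: punctuation attached to a marked word ("=college.", "=deal.") disappears along with the word, because the whole field is replaced. If that matters, a possible variation is sketched below. It is an assumption about what might be wanted, not something requested in the thread, and the function name number_blanks_keep_punct and the variable tail are made up for the example.

```shell
# Hypothetical variant: keep trailing punctuation after the numbered blank.
number_blanks_keep_punct() {
    awk '
        /=/ {
            for (i = 1; i <= NF; i++)
                if (substr($i, 1, 1) == "=") {
                    tail = ""
                    # save any run of . , ; : ! ? at the end of the word
                    if (match($i, /[.,;:!?]+$/))
                        tail = substr($i, RSTART)
                    $i = "__________[" count++ "]" tail
                }
        }
        { print }
    ' "$@"
}

printf 'I never =graduated from =college. Truth\n' | number_blanks_keep_punct
# prints: I never __________[0] from __________[1]. Truth
```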

Moderator's Comments:
How to use code tags when posting data and code samples.

Last edited by Franklin52; 11-21-2011 at 08:01 AM..