Help in awk/bash

    #22  
Old Unix and Linux 01-01-2013   -   Original Discussion by bioinfo
bioinfo's Unix or Linux Image
bioinfo bioinfo is offline
Registered User
 
Join Date: Dec 2012
Last Activity: 12 August 2013, 3:07 AM EDT
Posts: 50
Thanks: 52
Thanked 0 Times in 0 Posts
Thanks.

How should I start learning shell scripting and awk programming better? Is there a book you would recommend?

Thanks again.
    #23  
Old Unix and Linux 01-02-2013   -   Original Discussion by bioinfo
Don Cragun's Unix or Linux Image
Don Cragun Don Cragun is offline Forum Staff  
Administrator
 
Join Date: Jul 2012
Last Activity: 14 December 2017, 6:02 AM EST
Location: San Jose, CA, USA
Posts: 10,774
Thanks: 590
Thanked 3,771 Times in 3,219 Posts
Quote:
Originally Posted by bioinfo View Post
Thanks a lot, Don Cragun and Corona688. I edited the script in vi and it's working.

I have one more query. I am using the following tro.txt as my input file for further program:



I wish to delete all lines like the following from this file:

Following entry (2659) comes from 265920.000 truncated:
Following entry (2703) comes from 270330.000 rounded:
Following entry (2703) comes from 270360.000 rounded:
..........................................................................
..........................................................................

Required output:



Please guide.
Thanks.
In addition to the grep Corona688 provided, you could also add another output file to the awk script I provided, or add an option to the script to control whether or not marker lines should be included in the tro.txt output file, or just always leave out the markers in the tro.txt output file.
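A minimal sketch of the "leave out the markers" option, applied after the fact with a filter. It assumes that every marker line starts with the text "Following entry (" as in the quoted examples; the file names tro_sample.txt, tro_clean.txt, and tro_clean_awk.txt are made up for illustration, and the sample data only imitates the real tro.txt, which is not shown here.

```shell
# Hypothetical sample standing in for tro.txt (the real file is not shown).
cat > tro_sample.txt <<'EOF'
Following entry (2659) comes from 265920.000 truncated:
ATOM 1 N SER A 1 35.092 83.194 140.076 1.00 0.00 N
ENDMDL
Following entry (2703) comes from 270330.000 rounded:
ATOM 1 N SER A 1 35.683 81.326 139.778 1.00 0.00 N
ENDMDL
EOF

# Drop every line that starts with the marker text; keep everything else.
grep -v '^Following entry (' tro_sample.txt > tro_clean.txt

# Equivalent awk form, handy if the filter is to be folded into an awk script.
awk '!/^Following entry \(/' tro_sample.txt > tro_clean_awk.txt
```

Both commands write the same four non-marker lines. If marker text could also appear at the start of a real data line, the pattern would need to be made more specific.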
    #24  
Old Unix and Linux 01-03-2013   -   Original Discussion by bioinfo
Hi,
I have two files:


Code:
11.txt showing two patterns:

ATOM 1 N SER A 1 35.092 83.194 140.076 1.00 0.00 N 
ATOM 2 CA SER A 1 35.216 83.725 138.725 1.00 0.00 C 
TER
ENDMDL
ATOM 1 N SER A 1 35.683 81.326 139.778 1.00 0.00 N 
ATOM 2 CA SER A 1 35.422 82.736 139.929 1.00 0.00 C 
TER
ENDMDL


Code:
c.txt

Number of groups: 40  3.95
Group: 0 Branches: 1
0    001
Centre: 001 Nodes: 1
Group: 1 Branches: 1
0    002
Centre: 002 Nodes: 1
Group: 2 Branches: 6
0    009
1    004
2    008
3    007
4    005
5    006
Centre: 006 Nodes: 6

ENDMDL occurs many times in 11.txt. I wish to retrieve the pattern (the set of lines ending with ENDMDL) that corresponds to the value of Id. That is, if I give 004 (the Id) from group 2 as input, the output should be the fourth repeat from 11.txt, ending with ENDMDL.


Code:
Id004.txt

Group2: Id 004
ATOM 1 N SER A 1 35.092 83.194 140.076 1.00 0.00 N 
ATOM 2 CA SER A 1 35.216 83.725 138.725 1.00 0.00 C 
TER
ENDMDL

So, for each value of Id from c.txt, I want to retrieve the repeat with that number from 11.txt. Please advise how to do this.

Also, I wish to write these patterns to individual files based on their Id, group, and centre. For example:
group0.txt contains all patterns with Id
group1.txt contains all patterns with Id
group2.txt contains all patterns with Id
One file containing the patterns corresponding to the centre Id

Code:
Id001.txt
Id002.txt
Id009.txt
............
............

Thanks

Last edited by Scrutinizer; 01-04-2013 at 12:40 AM.. Reason: quote tags -> code tags
    #25  
Old Unix and Linux 01-04-2013   -   Original Discussion by bioinfo
This is the third or fourth problem you have posted to this thread. Reading through the thread, it is getting hard to determine which problem some of the comments are addressing.

I have shown you how to read 11.txt, accumulate the entries in it for each set of lines ending with an ENDMDL line, and print selected entries from the accumulated list. You know what files you want to create and what you want in them, so why don't you try putting together an awk script to do that and let us know what isn't working.

From your description of groups, centres, and IDs, I have no idea how many files you want created nor what is supposed to be in each of them. I also don't see any use for the lines starting with Centre: in your c.txt file; they just have the characters Centre: followed by the Id of the last Branch in the Group that they follow, followed by the characters Nodes: , followed by the number of branches listed on the preceding Group: line. What is the difference between a Node and a Branch? What is the difference between a Group and a Centre?

If you can't do this awk script yourself, you're going to have to give us a lot more detail specifying the exact list of the files you want produced in response to the snippet from c.txt you provided, along with the data that you want written into those files.
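As a starting point, the accumulate-and-select approach described above can be sketched as follows. This is only a minimal illustration, not the script from earlier in the thread: the file names blocks_sample.txt and Id002.txt, the shell variable id, and the shortened sample data are all assumptions, and it handles a single Id rather than the group and centre files.

```shell
# Sample data modelled on the 11.txt snippet (coordinates shortened).
cat > blocks_sample.txt <<'EOF'
ATOM 1 N SER A 1 35.092
TER
ENDMDL
ATOM 1 N SER A 1 35.683
TER
ENDMDL
EOF

id=002   # hypothetical Id taken from c.txt; leading zeros are allowed

# Main rule: append each line to r[rc]; an ENDMDL line closes the entry.
# END rule: id + 0 forces a numeric value, so "002" selects entry 2.
awk -v id="$id" 'BEGIN { rc = 1 }
{   r[rc] = r[rc] $0 "\n"
    if ($0 == "ENDMDL") rc++
}
END { printf("%s", r[id + 0]) }' blocks_sample.txt > "Id$id.txt"
```

The `id + 0` conversion matters because the entries are stored under subscripts 1, 2, 3, and so on, while the Ids in c.txt are written with leading zeros (001, 002, ...).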
    #26  
Old Unix and Linux 01-04-2013   -   Original Discussion by bioinfo
Thanks
I will post it in a new thread with more detail.
    #27  
Old Unix and Linux 01-07-2013   -   Original Discussion by bioinfo
Hi,
The script at #15 is working great.
I have two questions related to it.

(1) What if I only want the patterns from 11.txt selected by lines whose field 1 is divisible by 100 (that is, skip the line if $1 % 100 != 0), written only to the file no.txt?
(2) Also, is it possible to number the rows (those whose 1st field is divisible by 100 and which are used for retrieving patterns from 11.txt), and also to number the patterns retrieved from 11.txt?

Should I use the following code for (1)?

Code:
no=${1:-no.txt}         # name of file for no entry if $1%100 != 0
awk -v no="$no" 'BEGIN {rc = 1}
FNR == NR {r[rc] = r[rc] $0 "\n"
    if($0 == "ENDMDL") rc++
    next}
{   # If we got to here, we are reading lines from the 2nd file.
    # Determine exact, truncated, and rounded entry numbers.
    if (substr($1, length($1) - 5) == "00.000") {
        # $1 ends in 00.000; no truncation or rounding needed.
        entry = substr($1, 1, length($1) - 6)
        round = trunc = 0
    } else {
	# $1 is not evenly divisible by 100; calculate rounded and truncated
        # values.
        entry = 0
        round = sprintf("%.0f", $1 / 100)
        trunc = substr($1, 1, length($1) - 6)
    }
          # Write the appropriate entry
        # to each output file.
        printf("%s", r[entry]) > no
       } 
    }'
11.txt o.txt

Thanks.
    #28  
Old Unix and Linux 01-07-2013   -   Original Discussion by bioinfo
No. I assume that you tried running this awk script and got an error saying that your opening "{"s didn't match your "}"s. Since you moved the names of the files to be processed to a line of their own, if the awk script had run, it would have read its input from standard input (not from 11.txt and o.txt). And, instead of skipping over lines with a $1 that did not end in 00.000, it would have written the 0th entry from 11.txt. In this case you would still get what you want, since r[0] is an empty string and writing it to the file no wouldn't have done anything.

A corrected and simplified version of this script would be something like:

Code:
awk -v no="no.txt" 'BEGIN {rc = 1}
FNR == NR {r[rc] = r[rc] $0 "\n"
    if($0 == "ENDMDL") rc++
    next}
{   # If we got to here, we are reading lines from the 2nd file.
    # Only lines whose $1 ends in 00.000 select an entry; skip all others.
    if (substr($1, length($1) - 5) == "00.000") {
        # $1 ends in 00.000; write an entry corresponding to this line.
        entry = substr($1, 1, length($1) - 6)

        # Write the selected entry to the output file.
        printf("%s", r[entry]) > no
    }
}' 11.txt o.txt

Yes, it is possible to number entries from 11.txt and to number rows from o.txt, but you'll have to specify what you mean by that by showing the exact output that you want to appear in no.txt when using your 11.txt and the following instead of your version of o.txt:

Code:
100.000
2010.000
1000.000
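
For reference, here is a small sketch of how the selection test in the script above treats those three lines. It only reports the entry number each line would select and does not read 11.txt; the file names o_sample.txt and selection.txt and the report wording are made up for illustration.

```shell
# The three sample lines suggested above as a replacement for o.txt.
printf '%s\n' 100.000 2010.000 1000.000 > o_sample.txt

# Same test as in the script: the last six characters of $1 must be "00.000";
# the entry number is whatever precedes those six characters.
awk '{
    if (substr($1, length($1) - 5) == "00.000")
        printf("line %d: %s selects entry %s\n", FNR, $1,
            substr($1, 1, length($1) - 6))
    else
        printf("line %d: %s skipped\n", FNR, $1)
}' o_sample.txt > selection.txt
```

With this input, selection.txt reports that line 1 selects entry 1, line 2 is skipped, and line 3 selects entry 10.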

If you're talking about adding a tag line to the output specifying the entry # from 11.txt and the line number from o.txt, you have seen examples of how to produce tag lines in earlier scripts I have provided (including the script you stripped down to produce the script above). The entry number from 11.txt being printed is specified by the variable entry, and the line number from o.txt producing an output line is specified by the variable FNR.

One way to add a tag doing this would be to change the last printf in the above script from:

Code:
        printf("%s", r[entry]) > no

to:

Code:
        printf("The following entry from line %d is for Branch %d:\n%s",
            FNR, entry, r[entry]) > no

If you want each line of output in no.txt to include the Branch #, that is also easy to do, but it changes the code where entries are accumulated from 11.txt instead of changing the printf at the end of the script. If you want each line of output in no.txt to include the Branch # and the line # from o.txt, that can also be done, but it will involve changing the way the script accumulates and prints entries from 11.txt.
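
One possible way to do the second variant is sketched below, under stated assumptions: the arrays line and n, the prefix format "Branch <entry> line <FNR>:", and the sample file names are all made up for illustration, and the sample data is shortened. Each line of an entry is stored separately instead of being concatenated into one string, so a prefix can be added when the entry is printed.

```shell
# Hypothetical sample files standing in for 11.txt and o.txt.
cat > blocks_sample.txt <<'EOF'
ATOM 1 N SER A 1 35.092
ENDMDL
ATOM 1 N SER A 1 35.683
ENDMDL
EOF
printf '%s\n' 150.000 200.000 > o_sample.txt

awk -v no="no_sample.txt" 'BEGIN { rc = 1 }
FNR == NR {
    # 1st file: store line number n[rc] of entry rc; ENDMDL closes the entry.
    line[rc, ++n[rc]] = $0
    if ($0 == "ENDMDL") rc++
    next
}
substr($1, length($1) - 5) == "00.000" {
    # 2nd file: $1 ends in 00.000, so print the selected entry line by line,
    # prefixing each line with the entry number and the o.txt line number.
    entry = substr($1, 1, length($1) - 6) + 0
    for (i = 1; i <= n[entry]; i++)
        printf("Branch %d line %d: %s\n", entry, FNR, line[entry, i]) > no
}' blocks_sample.txt o_sample.txt
```

Here only the second sample line (200.000) ends in 00.000, so the two lines of entry 2 are written to no_sample.txt, each carrying both numbers. If only the Branch # is wanted, the simpler route is to add it while accumulating, as described above.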

Last edited by Don Cragun; 01-07-2013 at 11:05 PM.. Reason: add missing [ICODE] tag