awk to format each line by pattern
Post 303018798 by cmccabe on Friday 15th of June 2018 09:49:51 AM
The awk works great, thank you. I found two additional format types and commented your code to try to capture them, but I don't think I understand it correctly. Would you be able to comment it so I can try to make the changes? I added the second block (the one that splits $4 into A) to capture the pattern in line 5: split $4 on the _ and capture the 2nd value if it is alpha. Also, I can't figure out how a numeric value inside () does not get printed. Thank you very much.

Code:
awk '
  {
    split($4,F,/_/)              # split field 4 on _ and store the pieces in F
    if(split(F[3],G,/[)(]/)) {   # split the 3rd piece on ( or ) and store the parts in G
      if(G[2]~/[[:alpha:]]/)     # check whether G[2] (the text inside the parentheses) contains a letter
        p=G[2]                   # if so, store G[2] as p
      else
        p=G[1]                   # otherwise (numeric or no parentheses) store G[1] as p
    }
    else
      p=F[3]                     # store the split value as p
  }
  {
    split($4,A,/_/)
    if(split(A[2],B,/[_]/)) {
      if(B[2]~/[[:alpha:]]/)
        p=B[2]
    }
  }
  {
    print "chr" $1, $2, $3, "chr" $1 ":" $2 "-" $3 OFS p  # print desired output
  }
' FS='\t' OFS='\t' in   # set FS and OFS to tab; "in" is the input file
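
On the question of why a numeric value inside () is not printed: split(F[3],G,/[)(]/) puts the text before "(" in G[1] and the text inside the parentheses in G[2]. When G[2] contains no letter, the else branch assigns G[1] instead, so the number is dropped. A minimal sketch of just that step follows. The function name show_paren_split is made up for illustration, the inputs are the third _-separated pieces from three of the sample rows, and /[A-Za-z]/ stands in for /[[:alpha:]]/ because some older awk builds may not handle POSIX character classes.

```shell
# show_paren_split is a hypothetical helper, not part of the original script.
show_paren_split() {
  awk -v OFS='|' '{
    split($0, G, /[)(]/)             # G[1] = text before "(", G[2] = text inside "()"
    if (G[2] ~ /[A-Za-z]/) p = G[2]  # letters inside the parentheses: keep them
    else                   p = G[1]  # digits only, or no parentheses: keep G[1]
    print G[1], G[2], p              # show both parts and the value chosen
  }'
}

printf '%s\n' '378884(NHLRC1)' 'HCP5(10866)' 'rs746647' | show_paren_split
```

The last column is the value that would end up in p. The same step also shows why the second block in the script above never changes p: A[2] contains no _ after the first split, so B[2] is always empty.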

Input file "in" (tab-delimited):
Code:
6	18122723	18122843	469_380805_378884(NHLRC1)_1.1_1
6	31114121	31114241	344047_16724314_rs746647_1
6	31430946	31431066	344049_16724385_HCP5(10866)_1_1
6	32808479	32808599	445446_18754304_PSMB8-exon6_1
1	33478785	33478905	19186497_AK2-Exon1_1
1	24022788	24022908	466743_18956150_RPL11-NM_000975-exon6_1

Desired output (tab-delimited):
Code:
chr6	18122723	18122843	chr6:18122723-18122843	NHLRC1
chr6	31114121	31114241	chr6:31114121-31114241	rs746647
chr6	31430946	31431066	chr6:31430946-31431066	HCP5
chr6	32808479	32808599	chr6:32808479-32808599	PSMB8-exon6
chr1	33478785	33478905	chr1:33478785-33478905	AK2-Exon1
chr1	24022788	24022908	chr1:24022788-24022908	RPL11-NM_000975-exon6
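
One possible way to cover all six sample rows, offered as a sketch and not as the original poster's or any responder's method: instead of picking a fixed _-separated position, trim the all-digit pieces from the front of $4 and the numeric pieces from the end, then apply the same parentheses check as before. The trimming rules are assumptions inferred only from the six rows shown, and extract_names is a hypothetical wrapper name.

```shell
# extract_names is a hypothetical wrapper; the trimming rules are assumptions
# inferred from the six sample rows, not a general specification.
extract_names() {
  awk -F'\t' -v OFS='\t' '{
    s = $4
    sub(/^([0-9]+_)+/, "", s)    # drop leading all-digit pieces, e.g. "469_380805_"
    sub(/(_[0-9.]+)+$/, "", s)   # drop trailing numeric pieces, e.g. "_1.1_1" or "_1"
    split(s, G, /[)(]/)          # G[1] = text before "(", G[2] = text inside "()"
    p = (G[2] ~ /[A-Za-z]/) ? G[2] : G[1]   # prefer the part inside () only if it has a letter
    print "chr" $1, $2, $3, "chr" $1 ":" $2 "-" $3, p
  }' "$@"
}
```

With the input saved in a file named in, this would be called as extract_names in. Because only the ends of $4 are trimmed, an inner _ such as the one in RPL11-NM_000975-exon6 is left alone.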


Last edited by cmccabe; 06-15-2018 at 10:51 AM.. Reason: fixed format
 
