awk to format file with conditional split
Post 302999751 by RavinderSingh13 on Tuesday 27th of June 2017 02:42:25 PM
Hello cmccabe,

Nice explanation. I have also added a complete, line-by-line explanation of the code here.
Code:
awk 'NR>1{ ###Process only lines whose line number is greater than 1, so line 1 is skipped.
                if($4 !~ /del/ && $4 !~ /ins/){ ###Check that the 4th field contains neither the string "del" nor the string "ins".
                                                REF=substr($4,index($4,">")-1,1); ###REF is the single character immediately before ">" in the 4th field.
                                                OBS=substr($4,index($4,">")+1,1) ###OBS is the single character immediately after ">" in the 4th field.
                                              }
                else                          { ###Otherwise the 4th field contains "del" or "ins", so perform the following actions.
                                                e=$4; ###Copy the 4th field into a variable named e.
                                                if($4 ~ /ins/){ ###Check whether the 4th field contains the string "ins".
                                                                sub(/.*ins/,"",e); ###Remove everything in e from the start up to and including "ins".
                                                                REF=""; ###Set REF to an empty string.
                                                                OBS=e ###Set OBS to what is left of e (the text after "ins").
                                                              };
                                                if($4 ~ /del/){ ###Check whether the 4th field contains the string "del".
                                                                sub(/.*del/,"",e); ###Remove everything in e from the start up to and including "del".
                                                                REF=e; ###Set REF to what is left of e (the text after "del").
                                                                OBS="" ###Set OBS to an empty string.
                                                              }
                                              };
                CHR=substr($7,1,index($7,":")-1); ###CHR is the part of the 7th field before the colon.
                CHR_1=substr($7,index($7,":")+1,index($7,"-")-index($7,":")-1);###CHR_1 is the part of the 7th field between the colon and the dash; the length is the distance between the two positions, so it works for any length of CHR.
                CHR_2=substr($7,index($7,"-")+1);###CHR_2 is the part of the 7th field after the dash, up to the end of the field.
                if($4 !~ /ins/)               { ###If the 4th field does not contain "ins", print in the following form.
                                                print "chr"CHR,CHR_1 CHR_2$4,"REF="REF,"OBS="OBS,$1  ###Print chr plus CHR, then CHR_1, CHR_2 and the 4th field joined with no separator (there are no commas between them), then REF, OBS and the 1st field.
                                              }
                else                          { ###Otherwise the 4th field contains "ins", so print in the following form.
                                                print "chr"CHR,CHR_1,CHR_2,$4,"REF="REF,"OBS="OBS,$1 ###Print chr plus CHR, CHR_1, CHR_2, the 4th field, REF, OBS and the 1st field, each separated by OFS.
                                              }
         }
    ' Input_file                                ###Input_file is the name of the file to read.
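
To try the substr(), index() and sub() steps in isolation, here is a minimal sketch. It is not the code above: the function name parse_variant and the sample values are hypothetical, since the real Input_file layout is not shown in this post. The first word stands in for $4 and the second for $7. The start coordinate length is computed as the distance between the colon and the dash, and the ins/del checks are written as an else-if chain for brevity.

```shell
# Hypothetical helper: takes one line "VARIANT CHR:START-END" as its argument.
parse_variant() {
  printf '%s\n' "$1" | awk '{
      v = $1                                   # stands in for the 4th field
      if (v ~ /ins/)      { e = v; sub(/.*ins/, "", e); REF = ""; OBS = e }
      else if (v ~ /del/) { e = v; sub(/.*del/, "", e); REF = e;  OBS = "" }
      else {
          p   = index(v, ">")                  # position of ">" in the field
          REF = substr(v, p - 1, 1)            # one character before ">"
          OBS = substr(v, p + 1, 1)            # one character after ">"
      }
      colon = index($2, ":")                   # $2 stands in for the 7th field
      dash  = index($2, "-")
      CHR   = substr($2, 1, colon - 1)                 # text before the colon
      start = substr($2, colon + 1, dash - colon - 1)  # text between colon and dash
      stop  = substr($2, dash + 1)                     # text after the dash
      print "chr" CHR, start, stop, "REF=" REF, "OBS=" OBS
  }'
}

# Made-up sample values, one per branch:
parse_variant "c.79G>A 12:12345-12400"     # substitution
parse_variant "c.35delG 1:500-501"         # deletion
parse_variant "c.100insTT X:7000-7001"     # insertion
```

With these made-up values the substitution line gives REF=G OBS=A, the deletion line gives REF=G with an empty OBS, and the insertion line gives an empty REF with OBS=TT.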

Thanks,
R. Singh
 
