sed/awk script selective insert between lines


 
# 1  
Old 03-28-2009
sed/awk script selective insert between lines

Hi,
I have a file in the following format:
*RECORD*
*FIELD NO*
.......
.......
*FIELD TX*
Data
*FIELD AV*
Data
*FIELD RF*


*RECORD*
*FIELD NO*
.......
.......
*FIELD TX*
Data
*FIELD RF*


That is, some records have *FIELD AV* between *FIELD TX* and *FIELD RF*, and some do not.

I want to insert *FIELD AV* between *FIELD TX* and *FIELD RF* if it does not already exist.

Any input on such a script would be helpful.
# 2  
Old 03-29-2009
Please provide sample input and expected output. The sample input should cover all cases.
# 3  
Old 03-29-2009
There may be a way in 'awk' to do this but I tend to use 'sed' on the command line and 'perl' for more complex tasks.

Code:
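#!/usr/bin/perl
use strict;
use warnings;

# Input and output file names are hard-coded: reads "test", writes "output".
open(my $in,  '<', 'test')   or die "Cannot open test: $!";
open(my $out, '>', 'output') or die "Cannot open output: $!";

my $av = 0;    # set once a FIELD AV line has been seen in the current record

while (my $line = <$in>) {
  if ($line =~ /FIELD AV/) {
    $av = 1;
  }
  elsif ($line =~ /FIELD RF/) {
    # No AV field before RF: insert one. The marker spelling follows the
    # first post; change it if the file uses "*FIELD* AV" instead.
    print $out "*FIELD AV*\n" unless $av;
    $av = 0;    # reset for the next record
  }
  print $out $line;
}

close($in);
close($out) or die "Cannot close output: $!";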


Last edited by ldapswandog; 03-29-2009 at 12:21 PM..
# 4  
Old 03-29-2009
Sample Input and expected output

Here's the sample input data. There are 2 records, each delimited by *RECORD*.
I want to extract the data under each *FIELD* <fieldName> marker.

-----------------------------------------------------------------------------------
*RECORD*
*FIELD* NO
100050
*FIELD* TI
100050 AARSKOG SYNDROME
*FIELD* TX
Grier et al. (1983) reported father and 2 sons with typical Aarskog
syndrome, including short stature, hypertelorism, and shawl scrotum.
They tabulated the findings in 82 previous cases.

*FIELD* RF
1. Grier, R. E.; Farrington, F. H.; Kendig, R.; Mamunes, P.: Autosomal
dominant inheritance of the Aarskog syndrome. Am. J. Med. Genet. 15:
39-46, 1983.


*RECORD*
*FIELD* NO
100650
*FIELD* TI
+100650 ALDEHYDE DEHYDROGENASE 2 FAMILY; ALDH2
;;ALDEHYDE DEHYDROGENASE 2;;

*FIELD* TX

DESCRIPTION

Acetaldehyde dehydrogenase (EC 1.2.1.3) is the next enzyme after alcohol
dehydrogenase (see 103700) in the major pathway of alcohol metabolism.
There are 2 major ALDH isozymes in the liver: cytosolic ALDH1 (ALDH1A1;
100640) and mitochondrial ALDH2.

CLONING

*FIELD* AV
.0001
ALCOHOL SENSITIVITY, ACUTE
HANGOVER, SUSCEPTIBILITY TO, INCLUDED;;

The designation for the ALDH2*2 polymorphism has been changed from
GLU487LYS to GLU504LYS. The numbering change includes the N-terminal
mitochondrial leader peptide of 17 amino acids (Li et al., 2006).

*FIELD* RF
1. Agarwal, D. P.; Harada, S.; Goedde, H. W.: Racial differences
in biological sensitivity to ethanol: the role of alcohol dehydrogenase
and aldehyde dehydrogenase isozymes. Alcoholism 5: 12-16, 1981.

3. Braun, T.; Grzeschik, K. H.; Bober, E.; Singh, S.; Agarwal, D.
P.; Goedde, H. W.: The structural gene for the mitochondrial aldehyde
dehydrogenase maps to human chromosome 12. Hum. Genet. 73: 365-367,1986.

-------------------------------------------------------------------------
DESIRED OUTPUT
-------------------------------------------------------------------------

*RECORD*
*FIELD* NO
100050
*FIELD* TI
100050 AARSKOG SYNDROME
*FIELD* TX
Grier et al. (1983) reported father and 2 sons with typical Aarskog
syndrome, including short stature, hypertelorism, and shawl scrotum.
They tabulated the findings in 82 previous cases.

*FIELD* AV <---- This is the new entry
*FIELD* RF
1. Grier, R. E.; Farrington, F. H.; Kendig, R.; Mamunes, P.: Autosomal
dominant inheritance of the Aarskog syndrome. Am. J. Med. Genet. 15:
39-46, 1983.


*RECORD*
*FIELD* NO
100650
*FIELD* TI
+100650 ALDEHYDE DEHYDROGENASE 2 FAMILY; ALDH2
;;ALDEHYDE DEHYDROGENASE 2;;

*FIELD* TX

DESCRIPTION

Acetaldehyde dehydrogenase (EC 1.2.1.3) is the next enzyme after alcohol
dehydrogenase (see 103700) in the major pathway of alcohol metabolism.
There are 2 major ALDH isozymes in the liver: cytosolic ALDH1 (ALDH1A1;
100640) and mitochondrial ALDH2.

CLONING

*FIELD* AV
.0001
ALCOHOL SENSITIVITY, ACUTE
HANGOVER, SUSCEPTIBILITY TO, INCLUDED;;

The designation for the ALDH2*2 polymorphism has been changed from
GLU487LYS to GLU504LYS. The numbering change includes the N-terminal
mitochondrial leader peptide of 17 amino acids (Li et al., 2006).

*FIELD* RF
1. Agarwal, D. P.; Harada, S.; Goedde, H. W.: Racial differences
in biological sensitivity to ethanol: the role of alcohol dehydrogenase
and aldehyde dehydrogenase isozymes. Alcoholism 5: 12-16, 1981.

3. Braun, T.; Grzeschik, K. H.; Bober, E.; Singh, S.; Agarwal, D.
P.; Goedde, H. W.: The structural gene for the mitochondrial aldehyde
dehydrogenase maps to human chromosome 12. Hum. Genet. 73: 365-367,1986.

---------------------------------------------------------------------------------

Goal: Insert the line "*FIELD* AV" between *FIELD* TX and *FIELD* RF if it does not already exist.

Background:
The data is a free-form text file in which the following order of fields must be maintained for every record:
*FIELD* NO
*FIELD* TI
*FIELD* TX
*FIELD* AV
*FIELD* RF

Currently the first 3 fields are being loaded correctly using bulk data loading utilities (SQL*Loader, etc.), but since the 4th field is missing in some records, it introduces an off-by-one error.

Any input on preprocessing the file with sed/awk, so that I can continue using SQL*Loader, would be helpful.
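
One possible awk approach to this preprocessing, offered only as a minimal sketch. It assumes every marker starts at the beginning of a line and is spelled "*FIELD* XX" as in the desired output above, and that each record contains exactly one "*FIELD* RF" line. The function name insert_av is hypothetical and not part of the original thread.

```shell
#!/bin/sh
# Sketch: add a "*FIELD* AV" line before "*FIELD* RF" when the record has none.
# Assumes markers start at column 1 and are spelled "*FIELD* XX".
# insert_av is a hypothetical helper name; it filters stdin to stdout.
insert_av() {
  awk '
    /^\*RECORD\*/   { seen_av = 0 }         # new record: forget any earlier AV
    /^\*FIELD\* AV/ { seen_av = 1 }         # this record already has an AV field
    /^\*FIELD\* RF/ {                       # reached RF: decide whether to insert
      if (!seen_av) print "*FIELD* AV"      # add the missing marker line
      seen_av = 0
    }
    { print }                               # pass every input line through
  '
}

# Example run on a two-record sample: only the first record lacks AV.
printf '%s\n' '*RECORD*' '*FIELD* TX' 'text' '*FIELD* RF' 'ref' \
              '*RECORD*' '*FIELD* TX' 'text' '*FIELD* AV' '.0001' '*FIELD* RF' 'ref' |
  insert_av
```

On a real file it could be run as insert_av < input.txt > fixed.txt (both file names are placeholders), with fixed.txt then handed to SQL*Loader. The inserted field has a marker line but no data lines, which is what the desired output above shows.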
# 5  
Old 03-29-2009
SED/ AWK to do selective insert

I would appreciate it if anyone could give a handy sed/awk one-liner that inserts a line between two lines if and only if a condition is NOT fulfilled.

sample input record:
*field* 1
*field* 2
*field* 3

*field* 1
*field* 3


sample output record:
*field* 1
*field* 2
*field* 3

*field* 1
*field* 2
*field* 3


Insert the line between the two lines if it does not already occur; otherwise leave the record as is.

I know how to do this in Perl, but I am looking for an awk/sed command that would accomplish the same.
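
For the simplified sample above, a short awk filter along these lines may be enough. This is a sketch, not a tested production command: it assumes the marker lines match exactly (no trailing whitespace) and that "*field* 3" closes each record. The function name add_missing and the awk variable names want and end are made up for illustration.

```shell
#!/bin/sh
# Sketch: print "*field* 2" before "*field* 3" unless the record already has it.
# Assumes exact marker lines; add_missing, want and end are hypothetical names.
add_missing() {
  awk -v want='*field* 2' -v end='*field* 3' '
    $0 == want { seen = 1 }                        # the wanted line is present
    $0 == end  { if (!seen) print want; seen = 0 } # insert if missing, then reset
    1                                              # print every input line
  '
}

# Example run on the sample input record from this post.
printf '%s\n' '*field* 1' '*field* 2' '*field* 3' '' '*field* 1' '*field* 3' |
  add_missing
```

The awk program has no line-dependent syntax, so it can be collapsed onto a single line if a one-liner is preferred. Plain string comparison is used instead of a regular expression so that the asterisks need no escaping.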