Grep and substitute?


 
# 1  
Old 07-18-2014

I have to parse ASCII files, output the relevant data to a comma-delimited file and load it into a database table.

The specs for the file format have been recently updated and one section is causing problems. This is the original layout for that section.

Code:
    CSVHeaderAttr:PUIS,IdleImmediate,POH,Temp,WorstTemp
    CSVValuesAttr:NO,NO,9814,31,56

I parse it with `grep` like this:

Code:
    CSVAttributes=$(grep '^CSVValuesAttr:' "${filename}" | cut -d':' -f2)
    [ -z "$CSVAttributes" ] && CSVAttributes="NA"

It works great, but the section now has new fields and they are named differently:
Code:
    CSVHeaderAttr:PUIS,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
    CSVValuesAttr:NO,YES,YES,23861,31,51

Right now, I grep the files according to their layout (a field in the header tells me the layout version), write two different comma-delimited files, and load them into two different tables. I would like to output both sections to the same file so the data scientist only has one table to use in his analysis.

Is there a way to use grep to produce an output like this and substitute empty fields with NA?

For one file type:
Code:
    CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
    CSVValuesAttr:NO,NO,NA,NA,9814,31,56

For the other file type:
Code:
    CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
    CSVValuesAttr:NO,NA,YES,YES,23861,31,51

Thanks for your input.
# 2  
Old 07-18-2014
When the question is "grep and", awk is usually the answer. It has a language statement especially built for 'if this line matches this regex, do x'.
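
As a small illustration of that pattern-action form, here is a sketch of the original `grep | cut` step done in awk alone, including the "NA" fallback. The temp file and its contents are stand-ins built from the sample in the first post, not a real data file.

```shell
# Hypothetical sample file in the original layout, created only for this demo.
filename=$(mktemp)
printf '%s\n' \
    'CSVHeaderAttr:PUIS,IdleImmediate,POH,Temp,WorstTemp' \
    'CSVValuesAttr:NO,NO,9814,31,56' > "$filename"

# /regex/ { action } runs the action only on lines matching the regex.
# -F: splits each line on ":", so $2 is the text after the prefix.
CSVAttributes=$(awk -F: '
    /^CSVValuesAttr:/ { print $2; found = 1 }
    END               { if (!found) print "NA" }   # fallback when no line matched
' "$filename")

echo "$CSVAttributes"
rm -f "$filename"
```

With the sample above this prints `NO,NO,9814,31,56`, the same string the `grep | cut` pair would have produced.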

The program below works out which value belongs to which field from the header line of the file you give it and the header you pass in OUT=. This lets you use the exact same program on either format.

Code:
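$ awk '# Line 1 is the input header: remember each field name by position.
(NR==1) { for(N=1; N<=NF; N++){ F[$N]=N ; F[N]=$N } ; next }
# Later lines hold values: store each one under its field name.
{ for(N=1; N<=NF; N++) D[F[N]]=$N ; next }
END {
        print OUT;
        split(OUT, OF);   # field names wanted in the output, in order
        for(N=1; N in OF; N++)
                if(OF[N] in D) $N=D[OF[N]]; else $N="NA";
        print;
}' FS="," OFS="," OUT="CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp" oldformat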

CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
    CSVValuesAttr:NO,NO,NA,NA,9814,31,56

$ awk '(NR==1) { for(N=1; N<=NF; N++){ F[$N]=N ; F[N]=$N } ; next }
{ for(N=1; N<=NF; N++) D[F[N]]=$N ; next }
END {
        print OUT;
        split(OUT, OF);
        for(N=1; N in OF; N++)
        {
                $N="NA";
                if(OF[N] in D) $N=D[OF[N]];
        }
        print;
}' FS="," OFS="," OUT="CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp" newformat

CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
    CSVValuesAttr:NO,NA,YES,YES,23861,31,51

$
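
Both runs above assume the input holds only the header line and the values line. If the two lines sit inside a larger file (the original `grep ^CSVValuesAttr:` suggests they might), a variant that picks them out by prefix could look roughly like this. The function name `normalize_attrs`, the `SomeOtherSection` line and the temp file are made up for illustration, and treating an empty value as NA is an assumption.

```shell
# normalize_attrs FILE: hypothetical helper, not part of the original thread.
normalize_attrs() {
    awk -v want="PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp" '
        # Remember the field names from the header line, minus the prefix.
        /^[ \t]*CSVHeaderAttr:/ { sub(/^[^:]*:/, ""); nh = split($0, name, ",") }
        # Store each value under the name found at the same position.
        /^[ \t]*CSVValuesAttr:/ {
            sub(/^[^:]*:/, "")
            nv = split($0, val, ",")
            for (i = 1; i <= nv && i <= nh; i++) data[name[i]] = val[i]
        }
        END {
            n = split(want, out, ",")
            line = "CSVValuesAttr:"
            sep = ""
            for (i = 1; i <= n; i++) {
                # Assumption: a missing or empty value becomes NA.
                v = ((out[i] in data) && data[out[i]] != "") ? data[out[i]] : "NA"
                line = line sep v
                sep = ","
            }
            print "CSVHeaderAttr:" want
            print line
        }
    ' "$1"
}

# Demo on a made-up old-layout file with one unrelated line in front.
old=$(mktemp)
printf '%s\n' \
    'SomeOtherSection:ignored' \
    'CSVHeaderAttr:PUIS,IdleImmediate,POH,Temp,WorstTemp' \
    'CSVValuesAttr:NO,NO,9814,31,56' > "$old"
normalize_attrs "$old"
rm -f "$old"
```

For the demo file the second output line is `CSVValuesAttr:NO,NO,NA,NA,9814,31,56`. Keeping the wanted field list in one `-v want=` string means a later layout change should only need that string edited.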

# 3  
Old 07-19-2014
If your samples are representative, this might do the job:
Code:
awk     'FNR==1 {print "CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp"; TYPE=7-NF; next}
                {printf "%s:%s,%s,%s,%s,%s,%s\n", $1, $2, TYPE?$3:"NA", TYPE?"NA,NA":$3","$4, $(NF-2), $(NF-1), $NF }
        ' FS="[:,]"  file[12]
CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
CSVValuesAttr:NO,NO,NA,NA,9814,31,56
CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp
CSVValuesAttr:NO,NA,YES,YES,23861,31,51
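
The command above prints the header once per input file. If everything is meant to land in a single table, a variant that prints the header only once could be sketched as follows. The temp directory and the `file1`/`file2` contents are stand-ins built from the samples in the first post, and like the command above it assumes each file holds just the header line and the values line.

```shell
# Hypothetical sample inputs, one file per layout.
dir=$(mktemp -d)
printf '%s\n' \
    'CSVHeaderAttr:PUIS,IdleImmediate,POH,Temp,WorstTemp' \
    'CSVValuesAttr:NO,NO,9814,31,56' > "$dir/file1"
printf '%s\n' \
    'CSVHeaderAttr:PUIS,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp' \
    'CSVValuesAttr:NO,YES,YES,23861,31,51' > "$dir/file2"

combined=$(awk -F '[:,]' '
    # NR counts lines across all files, so this fires once in total.
    NR == 1  { print "CSVHeaderAttr:PUIS,IdleImmediate,IdleImmediateSupported,IdleImmediateEnabled,POH,Temp,WorstTemp" }
    # FNR restarts per file: 6 header fields means old layout, 7 means new.
    FNR == 1 { TYPE = 7 - NF; next }
    { printf "%s:%s,%s,%s,%s,%s,%s\n", $1, $2, (TYPE ? $3 : "NA"), (TYPE ? "NA,NA" : $3 "," $4), $(NF-2), $(NF-1), $NF }
' "$dir/file1" "$dir/file2")

echo "$combined"
rm -rf "$dir"
```

With these two inputs the result is one header line followed by `CSVValuesAttr:NO,NO,NA,NA,9814,31,56` and `CSVValuesAttr:NO,NA,YES,YES,23861,31,51`.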
