Adding a new column as sequential number but with a little complication


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Adding a new column as sequential number but with a little complication
# 1  
Old 01-05-2012
Adding a new column as sequential number but with a little complication

I am a newbie to shell programming and maybe somebody can help me out a little. Here's my problem:

I got a PIPE delimited file with header record. I need to add a new column name as RECORDKEY. I would like to use a counter to generate this new value for each record. I plan to do a while loop and adding the key (e.g. 1-100). To complicate the logic a bit, I need to check column4 and compare it to the value of previous column4. If they match then I don't need to do add +1(cnt=cnt+1) and still use the current value of "cnt".

I appreciate all help! I tried looking at different samples but I still cannot build a solid foundation to the problem I have. Smilie

INPUT FILE:
Code:
"SA ID"|"PER ID"|"SP ID"|"ACCT ID"|"PREM ID"|
"4809192005"|"8548667904"|"8579918510"|"4809192892"|"8579918555"|
"3242844005"|"8149108497"|"8138547810"|"3242844114"|"8138547802"|
"1039653968"|"8060515953"|"8006519921"|"1039653537"|"8006519920"|
"8864439005"|"3291253901"|"3406088605"|"8864439315"|"3406088600"|
"9746328005"|"8006660653"|"8006564710"|"9746328430"|"8006564769"|
"7655489005"|"2113888404"|"2165831410"|"9746328430"|"2165831429"|
"2833100005"|"7770432791"|"7770421305"|"2833100003"|"7770421334"|
"3842956005"|"2509734222"|"2551230310"|"3842956375"|"2551230340"|

DESIRED OUTPUT FILE:
Code:
"KEYRECORD"|"SA ID"|"PER ID"|"SP ID"|"ACCT ID"|"PREM ID"|
1|"4809192005"|"8548667904"|"8579918510"|"4809192892"|"8579918555"|
2|"3242844005"|"8149108497"|"8138547810"|"3242844114"|"8138547802"|
3|"1039653968"|"8060515953"|"8006519921"|"1039653537"|"8006519920"|
4|"8864439005"|"3291253901"|"3406088605"|"8864439315"|"3406088600"|
5|"9746328005"|"8006660653"|"8006564710"|"9746328430"|"8006564769"|
5|"7655489005"|"2113888404"|"2165831410"|"9746328430"|"2165831429"|
6|"2833100005"|"7770432791"|"7770421305"|"2833100003"|"7770421334"|
7|"3842956005"|"2509734222"|"2551230310"|"3842956375"|"2551230340"|

As you can see, line 5 and 6 has the same $4 value. So I don't want to change the KEYRECORD value.


Moderator's Comments:
Mod Comment Please use code tags, check PM for a guide.

Last edited by zaxxon; 01-05-2012 at 08:03 PM.. Reason: code tags
# 2  
Old 01-05-2012
I think this might work:

Code:
awk -F "|" '
    NR == 1 { print "\"KEYRECORD\"|" $0; next; }
    {
        if( !last ||  last != $4 )
            n++;
        print n "|" $0;
        last = $4;
    }
' input-file

# 3  
Old 01-09-2012
Thanks for the reply! Everything worked!

This time I have to check first if $4 is blank or not. Then create a new record key if $5 does not match the previous value.


My problem right now is my 2nd IF statement. For some reason it is not working. My desired output is to create a new key record if $4 is not blank.

Code:
awk -F \| '
    NR == 1 { print "\"KEYRECORD\"|" $0; next; }
    {
        if( $4 -eq 0 || !last ||  last != $4 || last5 != $5) {
            echo "\nProcess black account id....\n";
            n++;
        } 
        else 
        if( !last ||  last != $4 ) {
            echo "\nProcess nonblank account id....\n";
            n++;
        }        
        print n "|" $0;
        last = $4;
        last5 = $5;

    }
'  $CUSTFINAL > $CUSTFINAL2


Last edited by johnhips; 01-09-2012 at 04:19 PM..
# 4  
Old 01-09-2012
The expression syntax in awk is like C and not like that of the test function. Try this change:


Code:
awk -F \| '
    NR == 1 { print "\"KEYRECORD\"|" $0; next; }
    {
        if( $4 == "\"0\"" || !last ||  last != $4 || last5 != $5) {
            echo "\nProcess black account id....\n";
            n++;
        } 
        else 
        if( !last ||  last != $4 ) {
            echo "\nProcess nonblank account id....\n";
            n++;
        }        
        print n "|" $0;
        last = $4;
        last5 = $5;

    }
'  $CUSTFINAL > $CUSTFINAL2


Because your data is enclosed in quotes, you must include the quotes in the comparison.
This User Gave Thanks to agama For This Post:
# 5  
Old 01-09-2012
This is my desired output and I cannot get it to work.

Code:
"KEYRECORD"|"SA ID"|"PER ID"|"SP ID"|"ACCT ID"|"PREM ID"|"PHONE"|"NA
1|"          "|"          "|"3197691605"|"          "|"3197691657"|"
2|"          "|"          "|"3246309010"|"          "|"3246309029"|"
3|"          "|"          "|"3350909705"|"          "|"3311776290"|"
3|"          "|"          "|"3311776205"|"          "|"3311776290"|"
4|"          "|"          "|"3329099805"|"          "|"3329099858"|"
5|"          "|"          "|"3621605305"|"          "|"3621605356"|"
6|"0001344005"|"3157734331"|"3209921605"|"0001344441"|"3209921631"|"
6|"7189943005"|"3231403121"|"3210736405"|"0001344441"|"3210736492"|"
7|"0006746005"|"3287909287"|"3340056905"|"0006746763"|"3340056937"|"
8|"0007922990"|"5053661026"|"3519545205"|"0007922859"|"3519545208"|"
9|"0008717005"|"3508935326"|"3519329105"|"0008717685"|"3519329168"|
10|"0008969005"|"7117287080"|"3613118705"|"0008969048"|"3613118796"|
11|"0009160005"|"5094212566"|"3467460805"|"0009160056"|"3467460835"|

---------- Post updated at 06:18 PM ---------- Previous update was at 05:52 PM ----------

You are the best! I cannot figure how to compare the enclosed quotes and this fixed my code!! Thanks again!!
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Adding sequential index to duplicate strings

I have a text file in the following format >Homo sapiens KQKCLYNLPFKRNLEGCRERCSLVIQIPRCCKGYFGRDCQACPGGPDAPCNNRGVCLDQY SATGECKCNTGFNGTACEMCWPGRFGPDCLPCGCSDHGQCDDGITGSGQCLCETGWTGPS CDTQAVLPAVCTPPCSAHATCKENNTCECNLDYEGDGITCTVVDFCKQDNGGCAKVARCS... (2 Replies)
Discussion started by: jerrild
2 Replies

2. Shell Programming and Scripting

Adding number in one column

Hello I have something like this a 1 b 1 c 1 d 1 e 1 This is inside 1.dat and this is what I am trying to get a 2 b 2 c 2 d 2 e 2 (4 Replies)
Discussion started by: jeo_fb
4 Replies

3. Shell Programming and Scripting

Adding values of a column based on another column

Hello, I have a data such as this: ENSGALG00000000189 329 G A 4 2 0 ENSGALG00000000189 518 T C 5 1 0 ENSGALG00000000189 1104 G A 5 1 0 ENSGALG00000000187 3687 G T 5 1 0 ENSGALG00000000187 4533 A T 4 2 0 ENSGALG00000000233 5811 T C 4 2 0 ENSGALG00000000233 5998 C A 5 1 0 I want to... (3 Replies)
Discussion started by: Homa
3 Replies

4. Shell Programming and Scripting

Add markup tag and sequential number after specific line

Hello, This one has me a bit stumped. I have data the looks like, M END > <PREDICTION_ACCURACY> PROBABLE > <NO_OF_PARENTS> 3 > <CLOGP> -13.373 > <SMILES> OCC(O)C(OC1OC(CO)C(OC2OC(CO)C > <MIMW> 1006.322419888 (3 Replies)
Discussion started by: LMHmedchem
3 Replies

5. UNIX for Dummies Questions & Answers

Inserting a sequential number into a field on a flat file

I have a csv flatfile with a few million rows. I need to replace a field (field number is 85) in the file with a sequential number. As an example, let's assume there are only 4 fields in the file: A,A,,32 A,A,,27 A,B,,43 C,C,,354 If I wanted to amend the 3rd field in this way my... (2 Replies)
Discussion started by: BristolSmithy
2 Replies

6. Shell Programming and Scripting

Taking largest (negative) number from column of coordinates and adding positive form to every other

Hello all, I'm new to the forums and hope to be able to contribute something useful in the future; however I must admit that what has prompted me to join is the fact that currently I need help with something that has me at the end of my tether. I have a PDB (Protein Data Bank) file which I... (13 Replies)
Discussion started by: crunchgargoyle
13 Replies

7. Shell Programming and Scripting

Rename a header column by adding another column entry to the header column name URGENT!!

Hi All, I have a file example.csv which looks like this GrpID,TargetID,Signal,Avg_Num CSCH74_1_1,2007,61,256 CSCH74_1_1,212007,647,679 CSCH74_1_1,12007,3,32 CSCH74_1_1,207,299,777 I want the output as GrpID,TragetID,Signal-CSCH74_1_1,Avg_Num CSCH74_1_1,2007,61,256... (4 Replies)
Discussion started by: Vavad
4 Replies

8. Programming

Tool to simulate non-sequential disk I/O (simulate db file sequential read) in C POSIX

Writing a Tool to simulate non-sequential disk I/O (simulate db file sequential read) in C POSIX I have over the years come across the same issue a couple of times, and it normally is that the read speed on SAN is absolutely atrocious when doing non-sequential I/O to the disks. Problem being of... (7 Replies)
Discussion started by: vrghost
7 Replies

9. UNIX for Dummies Questions & Answers

Adding a column with the row number using awk

Is there anyway to use awk to add a first column to my data that automatically goes from 1 to n , where n is the numbers of my rows?:confused: (4 Replies)
Discussion started by: cosmologist
4 Replies

10. Programming

Reading special characters while converting sequential file to line sequential

We have to convert a sequential file to a 80 char line sequential file (HP UX platform).The sequential file contains special characters. which after conversion of the file to line sequential are getting coverted into "new line" or "tab" and file is getting distorted. Is there any way to read these... (2 Replies)
Discussion started by: Rajeshsu
2 Replies
Login or Register to Ask a Question