

Convert fixed value fields to comma separated values


 
# 1  
Old 07-13-2018

Hi All,

Hope you are doing great!

Today I have a problem that is really about performance improvement.

I have written Perl code that cuts out specific column ranges, but it takes a long time to run on files with thousands of lines, so I am hoping for something like a sed command to make it run quickly.

input format:

Code:
--> TP  ID: TEST TP                        XLATE KEY:   ANSIXX99  AL:   D
      INT ID: TESTREFORMAT            XLATE TABLE: X820XR99
      DOC ID: 820    DIR: I STD: ANSI     COM: X      VERS: NONCTX       STAT: P

--> TP  ID: TEST TP                             XLATE KEY:   ANSIXX41  AL:   D
      INT ID: TESTREFORMAT                        XLATE TABLE: X820XR99
      DOC ID: 820    DIR: I STD: ANSI     COM: X      VERS: 004010       STAT: P

--> TP  ID: TEST TP                             XLATE KEY:   XXXXXXXX   AL:   D
     INT ID: TESTREFORMAT                        XLATE TABLE: XXXXXXXX
     DOC ID: 820    DIR: I STD: ANSI     COM: X      VERS: 004010       STAT: T

output format required:

Code:
TEST TP,ANSIXX99,D,TESTREFORMAT,X820XR99,820,I,ANSI,X,NONCTX,P
TEST TP,ANSIXX41,D,TESTREFORMAT,X820XR99,820,I,ANSI,X,004010,P
TEST TP,XXXXXXXX,D,TESTREFORMAT,XXXXXXXX,820,I,ANSI,X,004010,T

My input file is in the format shown in the example above.

Rules for the input format:
  • The "--> TP ID:" line repeats every three lines.
    The values after each ":" vary in length, within the width allotted to them, but the labels before each ":" are fixed in length.

So the goal is to turn the values after each ":" into comma-separated values, as in the output format.

I am using AIX V6.0. A sed command would be the preferred solution.

Thanks.

---------- Post updated at 11:03 PM ---------- Previous update was at 11:01 PM ----------

I have pasted the actual data, and an empty line shows up after every 3 lines in the post above, but the actual input file will not contain any empty lines.
# 2  
Old 07-13-2018
It looks like sed could be useful here. This is how you could start your script file (the file named after sed -f).
Code:
/-->/ {  N
         N
         ....................
         s/  */,/g
       }

From here you have all three input records in your pattern space, separated by "\n" characters. A set of "s" commands can take out the "boilerplate" constants and the newline characters. The last "s" makes it ready for output; note that there are two spaces in front of the asterisk.
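As a rough sketch of how that set of "s" commands might look, assuming the eleven labels seen in the sample from post #1 are the only ones present and always appear in the same order: each label, together with the spaces around it, is replaced by a comma using one "s" command per label. The function name labels_to_csv_sed is made up for illustration, and the label list is an assumption taken from the sample, not from the real file.

```shell
# labels_to_csv_sed: hypothetical helper name, not from the thread.
# Assumes each record is exactly three lines starting with "--> TP"
# and that values never contain a label followed by a colon.
labels_to_csv_sed() {
    sed -n '
/^--> TP/{
# pull the next two lines into the pattern space and join them
N
N
s/\n/ /g
# drop the first label, then turn every other label into a comma
s/^--> TP  *ID: *//
s/  *XLATE KEY: */,/
s/  *AL: */,/
s/  *INT ID: */,/
s/  *XLATE TABLE: */,/
s/  *DOC ID: */,/
s/  *DIR: */,/
s/  *STD: */,/
s/  *COM: */,/
s/  *VERS: */,/
s/  *STAT: */,/
# trim trailing spaces and print the finished record
s/ *$//
p
}' "$@"
}
```

It could be called as labels_to_csv_sed input.txt > output.csv, or fed from a pipe. Because the labels are matched explicitly, spaces inside a value such as "TEST TP" are left alone.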
# 3  
Old 07-13-2018
Your problem is - at least to me - unsolvable. You have one or multiple spaces as field separators, plus one or more spaces in the field values themselves. So you can't reliably and consistently tell values from labels etc. Anything proposed would be quite hazardous...
# 4  
Old 07-13-2018
This is as far as I can get:
Code:
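awk '
        {getline X                              # read the 2nd line of the record
         getline Y                              # read the 3rd line of the record
         $0 = $0 FS X FS Y                      # join all three lines into one
         sub (/^--> TP *ID: /, _)               # drop the first label (_ is an unset, i.e. empty, variable)
         FS = "|"
         gsub (/  +/, FS)                       # runs of 2+ spaces become field separators

         $1 = $1                                # rebuild $0 using OFS
         for (i=1; i<=NF; i++)  sub (/^.*: */, _, $i)    # strip everything up to the last colon in each field
         gsub (/,,/, ",")                       # collapse empty fields
         print
        }
' OFS=, file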
TEST TP,ANSIXX99,D,TESTREFORMAT,X820XR99,820,ANSI,X,NONCTX,P
TEST TP,ANSIXX41,D,TESTREFORMAT,X820XR99,820,ANSI,X,004010,P
TEST TP,XXXXXXXX,D,TESTREFORMAT,XXXXXXXX,820,ANSI,X,004010,T

I can't isolate the DIR: I value (only a single space separates it from STD:), and, as said before, I wouldn't rely on this proposal.
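One possible way to keep the DIR value, sketched here under the assumption that the label names in the sample are the complete set, is to split each joined record on the labels themselves instead of on runs of spaces. The function name labels_to_csv_awk and the label list are assumptions for illustration; a value that itself contains one of these labels followed by a colon would still break it.

```shell
# labels_to_csv_awk: hypothetical helper name, not from the thread.
# Assumes three-line records starting with "--> TP" and the label set
# seen in the sample data.
labels_to_csv_awk() {
    awk '
    /^--> TP/ {
        # read the other two lines of the record; stop on a short record
        if ((getline second) <= 0) exit
        if ((getline third)  <= 0) exit
        rec = $0 " " second " " third
        sub(/[ \t]+$/, "", rec)          # trim trailing blanks

        # each label, with the spaces around it, acts as a separator;
        # v[1] is the empty text before the first label
        n = split(rec, v, / *(--> TP +ID|INT ID|DOC ID|XLATE KEY|XLATE TABLE|DIR|STD|COM|VERS|STAT|AL): */)

        out = v[2]
        for (i = 3; i <= n; i++) out = out "," v[i]
        print out
    }' "$@"
}
```

It could be called as labels_to_csv_awk file. Lines that do not start a record, such as the blank lines in the posted sample, are skipped.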

Last edited by RudiC; 07-14-2018 at 04:59 AM..
# 5  
Old 07-15-2018
Quote:
Originally Posted by RudiC
Your problem is - at least to me - unsolvable. You have one or multiple spaces as field separators, plus one or more spaces in the field values themselves. So you can't reliably and consistently tell values from labels etc. Anything proposed would be quite hazardous...
RudiC, thanks for the reply, but I strongly believe this is solvable, since the field labels (the values to the left of each ":") are in fixed positions. What do you think, given that the data follows a pattern?
# 6  
Old 07-15-2018
If we go to post #1 in this thread and look at the XLATE KEY, XLATE TABLE, and COM fields in your sample input, it is immediately obvious that your fields are not fixed width (i.e., they do not appear at the same locations in each record).

Since your fields are not fixed width, since you have not defined what fields are present in general (as opposed to three record sample including blank lines that you say are not present in your real data), and since we have no idea what fields will be present or where they will be located in your real data; there is little we can do to help you solve your problem.
# 7  
Old 07-16-2018
Quote:
Originally Posted by Don Cragun
If we go to post #1 in this thread and look at the XLATE KEY, XLATE TABLE, and COM fields in your sample input, it is immediately obvious that your fields are not fixed width (i.e., they do not appear at the same locations in each record).

Since your fields are not fixed width, since you have not defined what fields are present in general (as opposed to three record sample including blank lines that you say are not present in your real data), and since we have no idea what fields will be present or where they will be located in your real data; there is little we can do to help you solve your problem.
Hi Don,
The XLATE KEY, XLATE TABLE, and COM fields were not in the same position across the 3 sample records, but each one repeats in the same position every 3 lines. The sample I posted here is demo data. I have attached a sample input data file. Please let me know your thoughts after seeing it.
