Converting a fixed-width file to a pipe-delimited file in Linux (Red Hat)


 
# 1  
Old 02-05-2015

Hi,
I am facing an odd problem with an awk script.
On HP-UX it behaves as expected, but on Red Hat Linux the same awk code does not give the same result.

The code below converts a fixed-width file to a pipe-delimited file on the HP-UX server.
awk code:
Code:
#!/bin/awk -f

NR!=1 {while(substr($0,1,2)!="GA")
      {gsub(/\|/,"-",$0); if(substr($0,1,1)=="H"&&substr($0,343,39)!="RUA DR. EDUARDO SANTOS SILVA 261 - FRAC"){print substr($0,1,11)"|" substr($0,12,12)"|" substr($0,24,4)"|" substr($0,28,3)"|" substr($0,31,40)"|" substr($0,71,14)"|" substr($0,85,2)"|" substr($0,87,4)"|" substr($0,91,10)"|" substr($0,101,46)"|" substr($0,147,35)"|" substr($0,182,8)"|" substr($0,190,2)"|" substr($0,192,2)"|" substr($0,194,1)"|" substr($0,195,2)"|" substr($0,197,2)"|" substr($0,199,2)"|" substr($0,201,2)"|" substr($0,203,2)"|" substr($0,205,8)"|" substr($0,213,16)"|" substr($0,229,3)"|" substr($0,232,3)"|" substr($0,235,2)"|" substr($0,237,2)"|" substr($0,239,2)"|" substr($0,241,8)"|" substr($0,249,2)"|" substr($0,251,12)"|" substr($0,263,40)"|" substr($0,303,40)"|" substr($0,343,40)"|" substr($0,383,40)"|" substr($0,423,40)"|" substr($0,463,40)"|" substr($0,503,40)"|" substr($0,543,40)"|" substr($0,583,40)"|" substr($0,623,40)"|" substr($0,663,40)"|" substr($0,703,40)"|" substr($0,743,40)"|" substr($0,783,40)"|" substr($0,823,40)"|" substr($0,863,5)"|" substr($0,868,1)"|" substr($0,869,1)"|" substr($0,870,2)"|" substr($0,872,32)"|" substr($0,904,16)"|" substr($0,920,16)"|" substr($0,936,2)"|" substr($0,938,2)"|" substr($0,940,2)"|" substr($0,942,12)"|" substr($0,954,4)"|" substr($0,958,8)"|" substr($0,966,2)"|" substr($0,968,6)"|" substr($0,974,3)"|" substr($0,977,10)"|" substr($0,987,4)"|" substr($0,991,10)"|" substr($0,1001,2)"|" substr($0,1003,4)"|" substr($0,1007,40)"|" substr($0,1047,24)"|" substr($0,1071,24)"|" substr($0,1095,24)"|" substr($0,1119,1)"|" substr($0,1120,14)"|" substr($0,1134,2)"|" substr($0,1136,4)"|" substr($0,1140,16)"|" substr($0,1156,14)"|" substr($0,1170,1)"|" substr($0,1171,8)"|" substr($0,1179,8)"|" substr($0,1187,9)"|"  substr($0,1196,20);next}
       else if(substr($0,1,1)=="H"&&substr($0,343,39)=="RUA DR. EDUARDO SANTOS SILVA 261 - FRAC"){print substr($0,1,11)"|" substr($0,12,12)"|" substr($0,24,4)"|"substr($0,28,3)"|" substr($0,31,40)"|" substr($0,71,14)"|" substr($0,85,2)"|" substr($0,87,4)"|" substr($0,91,10)"|" substr($0,101,46)"|" substr($0,147,35)"|" substr($0,182,8)"|" substr($0,190,2)"|" substr($0,192,2)"|" substr($0,194,1)"|" substr($0,195,2)"|" substr($0,197,2)"|" substr($0,199,2)"|" substr($0,201,2)"|" substr($0,203,2)"|" substr($0,205,8)"|" substr($0,213,16)"|" substr($0,229,3)"|" substr($0,232,3)"|" substr($0,235,2)"|" substr($0,237,2)"|" substr($0,239,2)"|" substr($0,241,8)"|" substr($0,249,2)"|" substr($0,251,12)"|" substr($0,263,40)"|" substr($0,303,40)"|" substr($0,343,39)"?|" substr($0,383,40)"|" substr($0,423,40)"|" substr($0,463,40)"|" substr($0,503,40)"|" substr($0,543,40)"|" substr($0,583,40)"|" substr($0,623,40)"|" substr($0,663,40)"|" substr($0,703,40)"|" substr($0,743,40)"|" substr($0,783,40)"|" substr($0,823,40)"|" substr($0,863,5)"|" substr($0,868,1)"|" substr($0,869,1)"|" substr($0,870,2)"|" substr($0,872,32)"|" substr($0,904,16)"|" substr($0,920,16)"|" substr($0,936,2)"|" substr($0,938,2)"|" substr($0,940,2)"|" substr($0,942,12)"|" substr($0,954,4)"|" substr($0,958,8)"|" substr($0,966,2)"|" substr($0,968,6)"|" substr($0,974,3)"|" substr($0,977,10)"|" substr($0,987,4)"|" substr($0,991,10)"|" substr($0,1001,2)"|" substr($0,1003,4)"|" substr($0,1007,40)"|" substr($0,1047,24)"|" substr($0,1071,24)"|" substr($0,1095,24)"|" substr($0,1119,1)"|" substr($0,1120,14)"|" substr($0,1134,2)"|" substr($0,1136,4)"|" substr($0,1140,16)"|" substr($0,1156,14)"|" substr($0,1170,1)"|" substr($0,1171,8)"|" substr($0,1179,8)"|" substr($0,1187,9)"|" substr($0,1196,20);next}
       else if(substr($0,1,1)=="C"){print substr($0,1,13);next}
       else if(substr($0,1,1)=="P"){print substr($0,1,13)"|" substr($0,14,4)"|" substr($0,18,2)"|" substr($0,20,8)"|" substr($0,28,6)"|" substr($0,34,14)"|" substr($0,48,10)"|"substr($0,58,10);next}
       else if(substr($0,1,1)=="D"){print substr($0,1,13)"|" substr($0,14,4)"|" substr($0,18,4)"|" substr($0,22,2)"|" substr($0,24,20)"|" substr($0,44,6)"|" substr($0,50,60)"|" substr($0,110,8)"|" substr($0,118,8)"|" substr($0,126,8)"|" substr($0,134,8)"|" substr($0,142,4)"|" substr($0,146,2)"|" substr($0,148,4)"|" substr($0,152,4)"|" substr($0,156,4)"|" substr($0,160,3)"|" substr($0,163,2)"|" substr($0,165,8)"|" substr($0,173,1)"|" substr($0,174,1)"|" substr($0,175,1)"|" substr($0,176,4)"|" substr($0,180,2)"|" substr($0,182,15)"|" substr($0,197,1)"|" substr($0,198,8)"|" substr($0,206,8)"|" substr($0,214,1)"|" substr($0,215,14)"|" substr($0,229,1)"|" substr($0,230,14)"|" substr($0,244,3)"|" substr($0,247,35)"|" substr($0,282,2)"|" substr($0,284,6)"|" substr($0,290,4)"|" substr($0,294,6)"|" substr($0,300,10);next}
       else if(substr($0,1,1)=="A"){print substr($0,1,13)"|" substr($0,14,4)"|" substr($0,18,4)"|" substr($0,22,8)"|" substr($0,30,2)"|" substr($0,32,8)"|" substr($0,40,8)"|" substr($0,48,4)"|" substr($0,52,8);next}
       else if(substr($0,1,1)=="S"){print substr($0,1,13)"|" substr($0,14,2)"|" substr($0,16,8)"|" substr($0,24,8)"|" substr($0,32,10)"|" substr($0,42,10)"|" substr($0,52,14)"|" substr($0,66,8)"|" substr($0,74,10)"|" substr($0,84,8)"|" substr($0,92,4)"|" substr($0,96,4)"|" substr($0,100,1)"|" substr($0,101,18)"|" substr($0,119,8);next}
       else if(substr($0,1,1)=="B"||substr($0,1,1)=="I"){print substr($0,1,11)"|" substr($0,12,6)"|" substr($0,18,6)"|" substr($0,24,12)"|" substr($0,36,4)"|" substr($0,40,13)"|" substr($0,53,10)"|" substr($0,63,8)"|" substr($0,71,15)"|" substr($0,86,13)"|" substr($0,99,10);next}}}
      END{if(substr($0,1,2)=="GA"){close($testfile)}}

In the file, one record contains special characters such as á and í as part of the data. On HP-UX, after converting the file to pipe-delimited format, the new file contains all the data, including those characters. Example:
|ES 28805 Alcal▒ de Henares |TEL GLOBAL TE S.A. |C/ Gran V▒a 28 | |

But when I use the same code on Linux, it drops the data after the special character and all the remaining columns come out empty:
ES 28805 Alcal||||||||||||||||||||||||||||||||||||||||||||||

Please advise.
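
For reference, the long substr() chains in the script above can also be driven from a list of column widths. This is only a minimal sketch, not the poster's method: the helper name to_pipe is hypothetical, and it handles a single record layout per call rather than choosing a layout from the first character of each line.

```shell
# to_pipe WIDTHS
# Reads fixed-width lines on stdin and writes pipe-delimited lines on stdout.
# WIDTHS is a space-separated list of column widths (hypothetical helper).
to_pipe() {
    awk -v widths="$1" '
    BEGIN { n = split(widths, w, " ") }   # w[1..n] holds the column widths
    {
        gsub(/\|/, "-")                   # as in the original: replace literal pipes in the data
        out = ""
        pos = 1
        for (i = 1; i <= n; i++) {
            sep = (i > 1) ? "|" : ""      # no separator before the first column
            out = out sep substr($0, pos, w[i])
            pos += w[i]                   # next column starts right after this one
        }
        print out
    }'
}
```

The widths for the "P" records in the script above would be "13 4 2 8 6 14 10 10", so a call such as `grep '^P' input.dat | to_pipe "13 4 2 8 6 14 10 10"` (input.dat being a placeholder file name) should produce the same columns as that branch.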

Last edited by fpmurphy; 02-05-2015 at 05:41 AM..
# 2  
Old 02-05-2015
Can you show us some sample input and the required output, please?
# 3  
Old 02-05-2015
Hi,
Can you show the locale of your HP-UX and your Linux machines?
# 4  
Old 02-05-2015
Do you have identical locales on both hosts?

On a Linux machine, an empty FS is possible, so something like
Code:
awk  '{MX=split (FLDS, P, " "); for (i=1; i<=MX; i++) $(P[i])=$(P[i]) "|" } 1' FS="" OFS="" FLDS="7 19 32" file
RUA DR.| EDUARDO SAN|TOS SILVA 261| - FRA

could work?
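
To see what the one-liner above does, here is a small sketch on a ten-character line. The wrapper name mark_cols is hypothetical, and it relies on an empty FS splitting the record into single characters, which gawk and mawk do but which POSIX leaves unspecified. Each number in FLDS is the position of the last character of a column, and a "|" is appended to that character.

```shell
# mark_cols POSITIONS
# Appends "|" after each listed character position (hypothetical wrapper
# around the one-liner above; needs an awk where FS="" means one char per field).
mark_cols() {
    awk -v FS= -v OFS= -v FLDS="$1" '
    {
        MX = split(FLDS, P, " ")                          # P[1..MX] = end positions
        for (i = 1; i <= MX; i++) $(P[i]) = $(P[i]) "|"   # tag the last char of each column
        print                                             # $0 is rebuilt with the empty OFS
    }'
}

printf 'ABCDEFGHIJ\n' | mark_cols "3 7"
# prints: ABC|DEFG|HIJ
```

Because the fields are characters here, the positions count characters in a UTF-8 locale and bytes in the C locale.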
# 5  
Old 02-06-2015
Hi rbattle,
sample data:
Code:
H721615296070R86593102170  999 OPN ID69                               20141117171817T1ZOR 700016901 4791032964                                    BVOM.ES@kk.COM                     20141117072  00  RE  DP20141031SYI1          074704700000532014111411B600*SAP01  Telefonica                              Avenida Punto Com, 23                                                                                                   ES 28805 Alcalá de Henares              TELEFONICA GLOBAL TECHNOLOGY S.A.       C/ Gran Vía 28                                                                                                          ES 28013 Madrid                         TELEFONICA GLOBAL TECHNOLOGY S.A.       C/ Gran Vía 28                                                                                                          ES 28013 Madrid                         HPT&CE3Z1                                       90592.78       113485.58 ECESDP28805       700020141117ORADIN13   0500701206    91717059  ZZ                                                                                                                     20141117173030BBWAT BBTO            20141117173030 0000000000000000 1.25270

Expected Output:
Code:
H7216152960|70R865931021|70  |999| OPN ID69                               |20141117171817|T1|ZOR |700016901 |4791032964                                    |BVOM.ES@kk.COM                     |20141117|07|2 | |00|  |RE|  |DP|20141031|SYI1          07|470|470|00|00|53|20141114|11|B600*SAP01  |Telefonica                              |Avenida Punto Com, 23                   |                                        |                                        |ES 28805 Alcalá de Henares              |TELEFONICA GLOBAL TECHNOLOGY S.A.       |C/ Gran Vía 28                          |                                        |                                        |ES 28013 Madrid                         |TELEFONICA GLOBAL TECHNOLOGY S.A.       |C/ Gran Vía 28                          |                                        |                                        |ES 28013 Madrid                         |HPT&C|E|3|Z1|                                |       90592.78 |      113485.58 |EC|ES|DP|28805       |7000|20141117|OR|ADIN13|   |0500701206|    |91717059  |ZZ|    |                                        |                        |                        |                        | |20141117173030|BB|WAT |BBTO            |20141117173030| |00000000|00000000| 1.25270 |

Hi rudic,
When I used your code, the second line gave me an error, so I executed your code only up to "file". That resolved the issue: the output file contains the special characters. But it does not give the expected result, because extra pipe delimiters appear. Please check the expected output. I also have to apply the conditions I gave in my previous post.

Please find attached the locale output for both machines.

Thanks.

Last edited by Franklin52; 02-06-2015 at 09:15 AM.. Reason: Please use code tags
# 6  
Old 02-06-2015
With a subset of field lengths
Code:
echo $FLDS
11 23 26 29 39 53 55 59 69 80 95 103 105 107 108 110 111 113 114 116 124 131 134

extracted from your expected output, and having corrected your sample data (adding the extra spaces that disappeared because you did not use code tags), the result of
Code:
awk  '{MX=split (FLDS, P, " "); for (i=1; i<=MX; i++) $(P[i])=$(P[i]) "|" } 1' FS="" OFS="" FLDS="$FLDS" file > file2

cat file[24] | less
H7216152960|70R865931021|70 |999| OPN ID69 |20141117171817|T1|ZOR |700016901 |4791032964 |BVOM.ES@kk.COM |20141117|07|2 | |00| |RE| |DP|20141031|SYI1 07|470|470|00|00|53|20141114|
H7216152960|70R865931021|70 |999| OPN ID69 |20141117171817|T1|ZOR |700016901 |4791032964 |BVOM.ES@kk.COM |20141117|07|2 | |00| |RE| |DP|20141031|SYI1 07|470|470|00|00|53|20141114|

is pretty close to what you expect (second line)...

I admit there might arise problems with non-ASCII chars as they occupy 2 or more bytes, but I presume it's difficult to create a fixed width file with non-ASCII text.
# 7  
Old 02-06-2015
Hi,
Quote:
Originally Posted by RudiC
I admit there might arise problems with non-ASCII chars as they occupy 2 or more bytes, but I presume it's difficult to create a fixed width file with non-ASCII text.
I'm not so sure. For example:
Code:
$ echo "Gran Vía" | od -c
0000000   G   r   a   n       V 303 255   a  \n
0000012
$ echo "Gran Vía" | LANG=C awk '{print length($0)}'
9
$ echo "Gran Vía" | LANG=fr_FR.UTF-8 awk '{print length($0)}'
8

Regards.
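
Following on from the od output above, one thing that may be worth trying is to force the C locale so that awk counts bytes instead of characters. This is a sketch under an assumption the thread does not confirm: that the data file uses a single-byte encoding such as ISO-8859-1, so that one character is one byte and the fixed column offsets are byte offsets. The helper names byte_len and byte_cut are hypothetical.

```shell
# Under LC_ALL=C, length() and substr() work on bytes.
byte_len() { LC_ALL=C awk '{ print length($0) }'; }

# byte_cut START LEN: print LEN bytes of each line starting at byte START.
byte_cut() { LC_ALL=C awk -v start="$1" -v len="$2" '{ print substr($0, start, len) }'; }

# "Gran Vía" with í as the single ISO-8859-1 byte 0xED (octal 355)
printf 'Gran V\355a\n' | byte_len
# prints: 8

# the same text in UTF-8, where í is the two bytes 303 255
printf 'Gran V\303\255a\n' | byte_len
# prints: 9
```

If the assumption holds, running the original script as `LC_ALL=C awk -f script.awk file` (script.awk and file being placeholder names) should make the substr offsets line up on Linux as well. If the file is actually UTF-8, byte offsets will not match the columns, and a UTF-8 locale would be needed instead.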