File Format


 
# 1  
Old 08-01-2008
File Format

Hello,

We have a type of ASCII file that we need to process on UNIX. I have no issues opening these files in Windows (using Notepad etc.),

but when I process one on UNIX I get the output below (the file has thousands of lines in it):

[root@www root]# file test.out
test.out: ASCII text, with very long lines, with no line terminators
[root@www root]# wc -l test.out
0 test.out
[root@www root]#

Is there any technique to get it properly line terminated?

Please advise.

TIA
Prvn
# 2  
Old 08-01-2008
Without detailed information about how the file is supposedly line terminated, it's tricky to come up with a suggestion. Regular DOS files have CR/LF line terminators, which count as line terminators in Unix too (Unix uses just LF and regards the DOS CR before it as just another control character). Can you post a hex dump of, say, the first few dozen characters, enough to see where the first line break is supposed to be? (Try hexdump test.out | head or xxd test.out | head or od -c test.out | head)
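A minimal sketch of that kind of check, assuming a POSIX shell with tr, wc and od available. The sample file and the count_byte helper are made up for illustration and are not taken from the poster's data; the idea is simply to count CR and LF bytes separately before guessing at the line-ending convention.

Code:
```shell
# Hypothetical sample: three records separated by bare CR, ending in CR LF.
sample=$(mktemp)
printf 'MSH|1\rPID|2\rPV1|3\r\n' > "$sample"

# count_byte is an illustrative helper, not a standard tool.
# $1 = tr escape for the byte to count, $2 = file to scan
count_byte() {
    tr -cd "$1" < "$2" | wc -c | tr -d ' '
}

cr_count=$(count_byte '\r' "$sample")   # carriage returns (0x0d)
lf_count=$(count_byte '\n' "$sample")   # line feeds (0x0a)
echo "CR: $cr_count  LF: $lf_count"

# Byte-by-byte view of the start of the file; CR and LF are shown as \r and \n.
od -c "$sample" | head

rm -f "$sample"
```

If the CR count is far larger than the LF count, the file most likely uses a bare CR as its record separator.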
# 3  
Old 08-01-2008
Guess - fixed record file without record delimiters

I see this often: the file has fixed-length records. You need to find the pattern in the data and reformat accordingly.

For instance, the following is three records, each comprising three data fields of five characters.

Code:
Joe  unix 12   Prvx ps   905  era  dos  1015

Once you know the data layout - the overall record length - you can begin converting the file into something you can process more easily.

One command I often use when confronted with this is:

Code:
> cat myfile | od -An -t dC -w10 | more

With a handy ASCII chart, you can probably read along the data file to see what is happening.
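Once the record length is known, splitting the stream could look like the sketch below. It assumes the 15-character layout of the example above (with the last field padded to full width) and uses the standard fold and cut utilities; the temporary file and variable names are illustrative only.

Code:
```shell
# Hypothetical fixed-width data: 3 records of 15 characters, no delimiters.
data=$(mktemp)
printf 'Joe  unix 12   Prvx ps   905  era  dos  1015 ' > "$data"

# Break the stream into one record per line, 15 bytes each.
records=$(fold -b -w 15 "$data")
printf '%s\n' "$records"

# Pull the second 5-character field (columns 6-10) out of every record.
second_fields=$(printf '%s\n' "$records" | cut -c6-10)
printf '%s\n' "$second_fields"

rm -f "$data"
```

This only works if every record really has the same length; a single short or long record shifts everything after it.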
# 4  
Old 08-01-2008
But the OP said the file wraps fine on DOS. Could be some weird control character (or Unicode line break? Shudder) which is interpreted by Notepad but not by Unix.
# 5  
Old 08-01-2008
Thanks for your replies.

Yes, it opens in Notepad without any problems, and if I copy from Notepad and paste into vi or another editor, the new file gets processed normally on UNIX. As we need to automate the process, we cannot afford this intermediate step.

Please find attached a sample file (zipped; test50.001 has 25 lines), which shows up as below:

[root@www]# cat test50.001
PV1|1|E|||||^||||||||||9999^NO PMD|||||||||||||||||||60||||||||0821102053|200829PV1|1|E|||||10359^FRANK RACHEL A||||||||||^|||||||||||||||||||60||||||||08211020PV1|1|E|||||10359^FRANK RACHEL A||||||||||10236^MODESTE KAREN A|||||||||||||||||PR1|16|C4|99282|||||||||||||~07
[root@www]# wc -l test50.001
1 test50.001
[root@www]# file test50.001
test50.001: ASCII text, with CRLF, CR line terminators
[root@www]#


Note: Please transfer the attached file directly to a UNIX box and test it there (without any copy/paste).

Thanks and Regards,
# 6  
Old 08-02-2008
The attachment is still pending approval. Could you instead just run a hexdump on the first few dozen bytes of a sample as previously requested? Also, using code tags when posting excerpts might improve readability. Thanks.
# 7  
Old 08-02-2008
Thanks era.

Please find below the hexdump and od output of test50.001 (this file has 25 lines in it):

+++++

[root@www ]# hexdump test50.001
0000000 534d 7c48 7e5e 265c 487c 374c 487c 7c43
0000010 4248 434f 4c7c 327c 3030 3038 3038 3131
0000020 3433 7c36 427c 5241 505e 3130 7c7c 7c50
0000030 2e32 0d33 4950 7c44 7c31 307c 3030 3930
0000040 3236 3235 7c35 527c 5541 204c 4f4d 524e
0000050 594f 4a5e 534f 7c45 7c7c 7c4d 7c7c 7c7c
0000060 7c7c 7c7c 307c 3238 3131 3230 3530 0d33
0000070 5650 7c31 7c31 7c45 7c7c 7c7c 7c5e 7c7c
0000080 7c7c 7c7c 7c7c 397c 3939 5e39 4f4e 5020
0000090 444d 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c
00000a0 7c7c 7c7c 367c 7c30 7c7c 7c7c 7c7c 307c
00000b0 3238 3131 3230 3530 7c33 3032 3830 3932
00000c0 3730 440d 3147 307c 497c 7c39 3837 2e36
00000d0 3035 440d 3147 317c 497c 7c39 3239 2e32
00000e0 0d31 4744 7c31 7c32 3949 347c 3339 392e
00000f0 0d30 4744 7c31 7c33 3949 457c 3139 2e37
0000100 0d39 4744 7c31 7c34 3949 457c 3438 2e39
0000110 0d33 5250 7c31 7c31 3949 387c 2e37 3334
0000120 7c7c 3032 3830 3730 3932 500d 3152 327c
0000130 497c 7c39 3738 342e 7c39 327c 3030 3038
0000140 3237 0d39 5250 7c31 3631 437c 7c34 3939
0000150 3832 7c32 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c
0000160 0d7e 534d 7c48 7e5e 265c 487c 374c 487c
0000170 7c43 4248 434f 4c7c 327c 3030 3038 3038
0000180 3131 3433 7c36 427c 5241 505e 3130 7c7c
0000190 7c50 2e32 0d33 4950 7c44 7c31 307c 3030
00001a0 3031 3431 3730 7c39 4d7c 4c45 5349 2041
00001b0 5547 4c41 504c 5e41 5242 4e45 4144 7c7c
00001c0 467c 7c7c 7c7c 7c7c 7c7c 7c7c 3830 3132
00001d0 3031 3032 3636 500d 3156 317c 457c 7c7c
00001e0 7c7c 317c 3330 3935 465e 4152 4b4e 5220
00001f0 4341 4548 204c 7c41 7c7c 7c7c 7c7c 7c7c
0000200 5e7c 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c
0000210 7c7c 7c7c 367c 7c30 7c7c 7c7c 7c7c 307c
0000220 3238 3131 3230 3630 7c36 3032 3830 3932
0000230 3730 440d 3147 307c 497c 7c39 3833 2e38
0000240 3037 440d 3147 317c 497c 7c39 3833 2e32
0000250 0d39 5250 7c31 3631 437c 7c34 3939 3832
0000260 7c32 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c 0d7e
0000270 534d 7c48 7e5e 265c 487c 374c 487c 7c43
0000280 4248 434f 4c7c 327c 3030 3038 3038 3131
0000290 3433 7c36 427c 5241 505e 3130 7c7c 7c50
00002a0 2e32 0d33 4950 7c44 7c31 307c 3030 3930
00002b0 3833 3932 7c32 4a7c 4e41 4349 2045 554c
00002c0 4f47 4a5e 4c55 4149 7c4e 7c7c 7c4d 7c7c
00002d0 7c7c 7c7c 7c7c 307c 3238 3131 3230 3630
00002e0 0d37 5650 7c31 7c31 7c45 7c7c 7c7c 3031
00002f0 3533 5e39 5246 4e41 204b 4152 4843 4c45
0000300 4120 7c7c 7c7c 7c7c 7c7c 7c7c 3031 3332
0000310 5e36 4f4d 4544 5453 2045 414b 4552 204e
0000320 7c41 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c
0000330 7c7c 7c7c 3036 7c7c 7c7c 7c7c 7c7c 3830
0000340 3132 3031 3032 3736 327c 3030 3238 3039
0000350 0d37 4744 7c31 7c30 3949 377c 3038 362e
0000360 440d 3147 317c 497c 7c39 3634 0d32 4744
0000370 7c31 7c32 3949 347c 3339 302e 0d30 4744
0000380 7c31 7c33 3949 327c 3137 332e 500d 3152
0000390 317c 7c36 3443 397c 3239 3238 7c7c 7c7c
00003a0 7c7c 7c7c 7c7c 7c7c 7e7c 0a0d
00003ac
[root@www ]# od test50.001
0000000 051515 076110 077136 023134 044174 033514 044174 076103
0000020 041110 041517 046174 031174 030060 030070 030070 030461
0000040 032063 076066 041174 051101 050136 030460 076174 076120
0000060 027062 006463 044520 076104 076061 030174 030060 034460
0000100 031066 031065 076065 051174 052501 020114 047515 051116
0000120 054517 045136 051517 076105 076174 076115 076174 076174
0000140 076174 076174 030174 031070 030461 031060 032460 006463
0000160 053120 076061 076061 076105 076174 076174 076136 076174
0000200 076174 076174 076174 034574 034471 057071 047516 050040
0000220 042115 076174 076174 076174 076174 076174 076174 076174
0000240 076174 076174 033174 076060 076174 076174 076174 030174
0000260 031070 030461 031060 032460 076063 030062 034060 034462
0000300 033460 042015 030507 030174 044574 076071 034067 027066
0000320 030065 042015 030507 030574 044574 076071 031071 027062
0000340 006461 043504 076061 076062 034511 032174 031471 034456
0000360 006460 043504 076061 076063 034511 042574 030471 027067
0000400 006471 043504 076061 076064 034511 042574 032070 027071
0000420 006463 051120 076061 076061 034511 034174 027067 031464
0000440 076174 030062 034060 033460 034462 050015 030522 031174
0000460 044574 076071 033470 032056 076071 031174 030060 030070
0000500 031067 006471 051120 076061 033061 041574 076064 034471
0000520 034062 076062 076174 076174 076174 076174 076174 076174
0000540 006576 051515 076110 077136 023134 044174 033514 044174
0000560 076103 041110 041517 046174 031174 030060 030070 030070
0000600 030461 032063 076066 041174 051101 050136 030460 076174
0000620 076120 027062 006463 044520 076104 076061 030174 030060
0000640 030061 032061 033460 076071 046574 046105 051511 020101
0000660 052507 046101 050114 057101 051102 047105 040504 076174
0000700 043174 076174 076174 076174 076174 076174 034060 030462
0000720 030061 030062 033066 050015 030526 030574 042574 076174
0000740 076174 030574 031460 034465 043136 040522 045516 051040
0000760 041501 042510 020114 076101 076174 076174 076174 076174
0001000 057174 076174 076174 076174 076174 076174 076174 076174
0001020 076174 076174 033174 076060 076174 076174 076174 030174
0001040 031070 030461 031060 033060 076066 030062 034060 034462
0001060 033460 042015 030507 030174 044574 076071 034063 027070
0001100 030067 042015 030507 030574 044574 076071 034063 027062
0001120 006471 051120 076061 033061 041574 076064 034471 034062
0001140 076062 076174 076174 076174 076174 076174 076174 006576
0001160 051515 076110 077136 023134 044174 033514 044174 076103
0001200 041110 041517 046174 031174 030060 030070 030070 030461
0001220 032063 076066 041174 051101 050136 030460 076174 076120
0001240 027062 006463 044520 076104 076061 030174 030060 034460
0001260 034063 034462 076062 045174 047101 041511 020105 052514
0001300 047507 045136 046125 040511 076116 076174 076115 076174
0001320 076174 076174 076174 030174 031070 030461 031060 033060
0001340 006467 053120 076061 076061 076105 076174 076174 030061
0001360 032463 057071 051106 047101 020113 040522 044103 046105
0001400 040440 076174 076174 076174 076174 076174 030061 031462
0001420 057066 047515 042504 052123 020105 040513 042522 020116
0001440 076101 076174 076174 076174 076174 076174 076174 076174
0001460 076174 076174 030066 076174 076174 076174 076174 034060
0001500 030462 030061 030062 033466 031174 030060 031070 030071
0001520 006467 043504 076061 076060 034511 033574 030070 033056
0001540 042015 030507 030574 044574 076071 033064 006462 043504
0001560 076061 076062 034511 032174 031471 030056 006460 043504
0001600 076061 076063 034511 031174 030467 031456 050015 030522
0001620 030574 076066 032103 034574 031071 031070 076174 076174
0001640 076174 076174 076174 076174 077174 005015
0001654
[root@www ]#


+++++++
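
A note on reading the dump above: hexdump without options prints 16-bit words in host byte order, so on a little-endian machine each pair of bytes appears swapped (534d is the bytes 4d 53, i.e. "MS"). Read that way, every segment ends in a lone 0d (CR) and the file ends in 0d 0a (CR LF), which matches what file reports and would explain why wc -l counts a single line. A minimal sketch of one possible conversion follows, assuming CR is the intended record separator and the only LF is the one at the very end; the sample data and file names are made up for illustration.

Code:
```shell
# Hypothetical stand-in for test50.001: CR between segments, CR LF at the end.
src=$(mktemp)
dst=$(mktemp)
trap 'rm -f "$src" "$dst"' EXIT
printf 'MSH|a\rPID|b\rPV1|c\r\n' > "$src"

# Drop the stray LF first, then turn every CR into an LF.
tr -d '\n' < "$src" | tr '\r' '\n' > "$dst"

lines=$(wc -l < "$dst" | tr -d ' ')
echo "lines after conversion: $lines"
cat "$dst"
```

Deleting LF before translating CR avoids an empty line where the final CR LF pair was. If a file could also contain LF-only line breaks, this approach would join those lines and is not suitable as written.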