The UNIX and Linux Forums  
  #1 (permalink)  
Old 08-01-2008
prvnrk prvnrk is offline
Registered User
  
 

Join Date: Jul 2007
Posts: 135
File Format

Hello,

We have a type of ASCII file that we need to process on UNIX. I have no trouble opening these files in Windows (using Notepad etc.),

but when I process one on UNIX I get the output below (the file has thousands of lines in it):

[root@www root]# file test.out
test.out: ASCII text, with very long lines, with no line terminators
[root@www root]# wc -l test.out
0 test.out
[root@www root]#

Is there any technique to get it properly line terminated?

Please advise.

TIA
Prvn
  #2 (permalink)  
Old 08-01-2008
era era is offline Forum Advisor  
Herder of Useless Cats (On Sabbatical)
  
 

Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,652
Without detailed information about how the file is supposed to be line terminated, it's tricky to come up with a suggestion. Regular DOS files have CR/LF line terminators, which count as line terminators in Unix too (Unix uses just LF, and regards the DOS CR before it as just another control character). Can you display a hex dump of, say, the first few dozen characters, enough to see where the first line break is supposed to be? (Try hexdump test.out | head or xxd test.out | head or od -c test.out | head)
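To illustrate the point about terminators: wc -l counts LF bytes only, so a CR/LF file still reports its lines, while a file that uses a bare CR between records reports almost none. A small sketch with made-up sample strings (not the OP's data; the count_lf name is just for illustration):

```shell
# Count "lines" the way wc -l does: one per LF byte.
# $1 is used as a printf format, so it must not contain % characters.
count_lf() {
    printf "$1" | wc -l | tr -d ' '
}

count_lf 'one\r\ntwo\r\n'    # DOS style (CR LF)            -> 2
count_lf 'one\rtwo\r'        # bare CR after each record    -> 0
count_lf 'one\rtwo\r\n'      # bare CRs, one CR LF at end   -> 1
```

A count of 0 or 1 on a file that looks multi-line elsewhere is therefore a hint that the records are separated by something other than LF.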
  #3 (permalink)  
Old 08-01-2008
joeyg's Avatar
joeyg joeyg is offline Forum Staff  
modérateur
  
 

Join Date: Dec 2007
Location: Home of 17-time world champion Boston Celtics
Posts: 1,311
Guess - fixed record file without record delimiters

I see this often, where the file is fixed record length. You need to find the pattern of data and reformat accordingly.

For instance, the following is three records; each record comprising 3 datafields; each datafield of five characters.

Code:
Joe  unix 12   Prvx ps   905  era  dos  1015
Once you know the data layout - the overall record length - you can begin converting the file into something you can process more easily.

One command I often use when confronted with this is:

Code:
> cat myfile | od -An -t dC -w10 | more
With a handy ASCII chart, you can probably read along the data file to see what is happening.
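If the file really is fixed-length records with no delimiters, fold can re-insert the line breaks once the record length is known. A minimal sketch using the 15-character records from the example above (the last record is padded to full width here, which is an assumption, and split_records is a made-up name):

```shell
# Break a stream of fixed-length records into one record per line.
# $1 = record length in characters
split_records() {
    fold -w "$1"
}

# Three records of 15 characters each, no delimiters between them.
printf 'Joe  unix 12   Prvx ps   905  era  dos  1015 ' | split_records 15
```

This only helps when every record has exactly the same length; a single short record shifts everything after it.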
  #4 (permalink)  
Old 08-01-2008
era era is offline Forum Advisor  
Herder of Useless Cats (On Sabbatical)
  
 

Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,652
But the OP said the file wraps fine on DOS. Could be some weird control character (or Unicode line break? Shudder) which is interpreted by Notepad but not by Unix.
  #5 (permalink)  
Old 08-01-2008
prvnrk prvnrk is offline
Registered User
  
 

Join Date: Jul 2007
Posts: 135
Thanks for your replies.

Yes, the file opens in Notepad without any problems, and if I copy from Notepad and paste into vi or another editor, the new file is processed normally on UNIX. As we need to automate the process, we cannot afford this intermediate step.

Please find attached a sample file (zipped; test50.001 has 25 lines), which nevertheless shows up as below:

[root@www]# cat test50.001
PV1|1|E|||||^||||||||||9999^NO PMD|||||||||||||||||||60||||||||0821102053|200829PV1|1|E|||||10359^FRANK RACHEL A||||||||||^|||||||||||||||||||60||||||||08211020PV1|1|E|||||10359^FRANK RACHEL A||||||||||10236^MODESTE KAREN A|||||||||||||||||PR1|16|C4|99282|||||||||||||~07
[root@www]# wc -l test50.001
1 test50.001
[root@www]# file test50.001
test50.001: ASCII text, with CRLF, CR line terminators
[root@www]#


Note: Please transfer the attached file directly to a UNIX box and test it there (without any copy/paste).

Thanks and Regards,
Attached Files
File Type: zip test50.zip (452 Bytes, 1 views)
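The file output above reports CR line terminators, and wc -l sees a single LF, which suggests the records are separated by a bare CR. Under that assumption, a conversion could be sketched as follows (cr_to_lf and the output file name are made up for illustration, and the sample data is invented):

```shell
# Turn CR-terminated records into LF-terminated lines.
# Existing LFs are dropped first so that a trailing CR LF does not
# turn into an extra empty line. Assumes no record relies on a bare LF.
cr_to_lf() {
    tr -d '\n' | tr '\r' '\n'
}

# Made-up three-record sample: bare CRs, with CR LF at the very end.
printf 'MSH|1\rPID|2\rPV1|3\r\n' | cr_to_lf | wc -l
```

On the real file this would be run as something like cr_to_lf < test50.001 > test50.unix, after which wc -l should report the record count instead of 1.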
  #6 (permalink)  
Old 08-02-2008
era era is offline Forum Advisor  
Herder of Useless Cats (On Sabbatical)
  
 

Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,652
The attachment is still pending approval. Could you instead just run a hexdump on the first few dozen bytes of a sample as previously requested? Also, using code tags when posting excerpts might improve readability. Thanks.
  #7 (permalink)  
Old 08-02-2008
prvnrk prvnrk is offline
Registered User
  
 

Join Date: Jul 2007
Posts: 135
Thanks era.

Please find below the hexdump and od outputs of test50.001 (this file has 25 lines in it):

+++++

[root@www ]# hexdump test50.001
0000000 534d 7c48 7e5e 265c 487c 374c 487c 7c43
0000010 4248 434f 4c7c 327c 3030 3038 3038 3131
0000020 3433 7c36 427c 5241 505e 3130 7c7c 7c50
0000030 2e32 0d33 4950 7c44 7c31 307c 3030 3930
0000040 3236 3235 7c35 527c 5541 204c 4f4d 524e
0000050 594f 4a5e 534f 7c45 7c7c 7c4d 7c7c 7c7c
0000060 7c7c 7c7c 307c 3238 3131 3230 3530 0d33
0000070 5650 7c31 7c31 7c45 7c7c 7c7c 7c5e 7c7c
0000080 7c7c 7c7c 7c7c 397c 3939 5e39 4f4e 5020
0000090 444d 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c
00000a0 7c7c 7c7c 367c 7c30 7c7c 7c7c 7c7c 307c
00000b0 3238 3131 3230 3530 7c33 3032 3830 3932
00000c0 3730 440d 3147 307c 497c 7c39 3837 2e36
00000d0 3035 440d 3147 317c 497c 7c39 3239 2e32
00000e0 0d31 4744 7c31 7c32 3949 347c 3339 392e
00000f0 0d30 4744 7c31 7c33 3949 457c 3139 2e37
0000100 0d39 4744 7c31 7c34 3949 457c 3438 2e39
0000110 0d33 5250 7c31 7c31 3949 387c 2e37 3334
0000120 7c7c 3032 3830 3730 3932 500d 3152 327c
0000130 497c 7c39 3738 342e 7c39 327c 3030 3038
0000140 3237 0d39 5250 7c31 3631 437c 7c34 3939
0000150 3832 7c32 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c
0000160 0d7e 534d 7c48 7e5e 265c 487c 374c 487c
0000170 7c43 4248 434f 4c7c 327c 3030 3038 3038
0000180 3131 3433 7c36 427c 5241 505e 3130 7c7c
0000190 7c50 2e32 0d33 4950 7c44 7c31 307c 3030
00001a0 3031 3431 3730 7c39 4d7c 4c45 5349 2041
00001b0 5547 4c41 504c 5e41 5242 4e45 4144 7c7c
00001c0 467c 7c7c 7c7c 7c7c 7c7c 7c7c 3830 3132
00001d0 3031 3032 3636 500d 3156 317c 457c 7c7c
00001e0 7c7c 317c 3330 3935 465e 4152 4b4e 5220
00001f0 4341 4548 204c 7c41 7c7c 7c7c 7c7c 7c7c
0000200 5e7c 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c
0000210 7c7c 7c7c 367c 7c30 7c7c 7c7c 7c7c 307c
0000220 3238 3131 3230 3630 7c36 3032 3830 3932
0000230 3730 440d 3147 307c 497c 7c39 3833 2e38
0000240 3037 440d 3147 317c 497c 7c39 3833 2e32
0000250 0d39 5250 7c31 3631 437c 7c34 3939 3832
0000260 7c32 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c 0d7e
0000270 534d 7c48 7e5e 265c 487c 374c 487c 7c43
0000280 4248 434f 4c7c 327c 3030 3038 3038 3131
0000290 3433 7c36 427c 5241 505e 3130 7c7c 7c50
00002a0 2e32 0d33 4950 7c44 7c31 307c 3030 3930
00002b0 3833 3932 7c32 4a7c 4e41 4349 2045 554c
00002c0 4f47 4a5e 4c55 4149 7c4e 7c7c 7c4d 7c7c
00002d0 7c7c 7c7c 7c7c 307c 3238 3131 3230 3630
00002e0 0d37 5650 7c31 7c31 7c45 7c7c 7c7c 3031
00002f0 3533 5e39 5246 4e41 204b 4152 4843 4c45
0000300 4120 7c7c 7c7c 7c7c 7c7c 7c7c 3031 3332
0000310 5e36 4f4d 4544 5453 2045 414b 4552 204e
0000320 7c41 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c 7c7c
0000330 7c7c 7c7c 3036 7c7c 7c7c 7c7c 7c7c 3830
0000340 3132 3031 3032 3736 327c 3030 3238 3039
0000350 0d37 4744 7c31 7c30 3949 377c 3038 362e
0000360 440d 3147 317c 497c 7c39 3634 0d32 4744
0000370 7c31 7c32 3949 347c 3339 302e 0d30 4744
0000380 7c31 7c33 3949 327c 3137 332e 500d 3152
0000390 317c 7c36 3443 397c 3239 3238 7c7c 7c7c
00003a0 7c7c 7c7c 7c7c 7c7c 7e7c 0a0d
00003ac
[root@www ]# od test50.001
0000000 051515 076110 077136 023134 044174 033514 044174 076103
0000020 041110 041517 046174 031174 030060 030070 030070 030461
0000040 032063 076066 041174 051101 050136 030460 076174 076120
0000060 027062 006463 044520 076104 076061 030174 030060 034460
0000100 031066 031065 076065 051174 052501 020114 047515 051116
0000120 054517 045136 051517 076105 076174 076115 076174 076174
0000140 076174 076174 030174 031070 030461 031060 032460 006463
0000160 053120 076061 076061 076105 076174 076174 076136 076174
0000200 076174 076174 076174 034574 034471 057071 047516 050040
0000220 042115 076174 076174 076174 076174 076174 076174 076174
0000240 076174 076174 033174 076060 076174 076174 076174 030174
0000260 031070 030461 031060 032460 076063 030062 034060 034462
0000300 033460 042015 030507 030174 044574 076071 034067 027066
0000320 030065 042015 030507 030574 044574 076071 031071 027062
0000340 006461 043504 076061 076062 034511 032174 031471 034456
0000360 006460 043504 076061 076063 034511 042574 030471 027067
0000400 006471 043504 076061 076064 034511 042574 032070 027071
0000420 006463 051120 076061 076061 034511 034174 027067 031464
0000440 076174 030062 034060 033460 034462 050015 030522 031174
0000460 044574 076071 033470 032056 076071 031174 030060 030070
0000500 031067 006471 051120 076061 033061 041574 076064 034471
0000520 034062 076062 076174 076174 076174 076174 076174 076174
0000540 006576 051515 076110 077136 023134 044174 033514 044174
0000560 076103 041110 041517 046174 031174 030060 030070 030070
0000600 030461 032063 076066 041174 051101 050136 030460 076174
0000620 076120 027062 006463 044520 076104 076061 030174 030060
0000640 030061 032061 033460 076071 046574 046105 051511 020101
0000660 052507 046101 050114 057101 051102 047105 040504 076174
0000700 043174 076174 076174 076174 076174 076174 034060 030462
0000720 030061 030062 033066 050015 030526 030574 042574 076174
0000740 076174 030574 031460 034465 043136 040522 045516 051040
0000760 041501 042510 020114 076101 076174 076174 076174 076174
0001000 057174 076174 076174 076174 076174 076174 076174 076174
0001020 076174 076174 033174 076060 076174 076174 076174 030174
0001040 031070 030461 031060 033060 076066 030062 034060 034462
0001060 033460 042015 030507 030174 044574 076071 034063 027070
0001100 030067 042015 030507 030574 044574 076071 034063 027062
0001120 006471 051120 076061 033061 041574 076064 034471 034062
0001140 076062 076174 076174 076174 076174 076174 076174 006576
0001160 051515 076110 077136 023134 044174 033514 044174 076103
0001200 041110 041517 046174 031174 030060 030070 030070 030461
0001220 032063 076066 041174 051101 050136 030460 076174 076120
0001240 027062 006463 044520 076104 076061 030174 030060 034460
0001260 034063 034462 076062 045174 047101 041511 020105 052514
0001300 047507 045136 046125 040511 076116 076174 076115 076174
0001320 076174 076174 076174 030174 031070 030461 031060 033060
0001340 006467 053120 076061 076061 076105 076174 076174 030061
0001360 032463 057071 051106 047101 020113 040522 044103 046105
0001400 040440 076174 076174 076174 076174 076174 030061 031462
0001420 057066 047515 042504 052123 020105 040513 042522 020116
0001440 076101 076174 076174 076174 076174 076174 076174 076174
0001460 076174 076174 030066 076174 076174 076174 076174 034060
0001500 030462 030061 030062 033466 031174 030060 031070 030071
0001520 006467 043504 076061 076060 034511 033574 030070 033056
0001540 042015 030507 030574 044574 076071 033064 006462 043504
0001560 076061 076062 034511 032174 031471 030056 006460 043504
0001600 076061 076063 034511 031174 030467 031456 050015 030522
0001620 030574 076066 032103 034574 031071 031070 076174 076174
0001640 076174 076174 076174 076174 077174 005015
0001654
[root@www ]#


+++++++
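A note on reading the dumps above: plain hexdump and plain od print 16-bit words in host byte order, so on a little-endian machine (as this one appears to be) 534d 7c48 is the byte sequence 4d 53 48 7c, i.e. MSH|, and 0d33 is a "3" followed by CR. The final 0a0d is CR LF, the only LF in the file. A byte-wise view avoids the swap. A sketch with a made-up sample rather than the attached file (show_bytes and count_cr are illustrative names):

```shell
# Show each byte in order, with control characters as escapes (\r, \n).
show_bytes() {
    od -An -c
}

# Count the CR bytes in the input, i.e. the number of CR-terminated records.
count_cr() {
    tr -cd '\r' | wc -c | tr -d ' '
}

printf 'MSH|\rPID|\r\n' | show_bytes
printf 'MSH|\rPID|\r\n' | count_cr
```

Running od -c test50.001 | head (or hexdump -C where available) on the real file should show \r at the end of each record in the same way.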