The UNIX and Linux Forums  
UNIX for Dummies Questions & Answers If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome !!

  #1 (permalink)  
02-19-2009
techsavvy007

File split question

I have a flat file on UNIX and need to perform two tasks on it. The data shown here is only a sample; the original file is much longer.

Positions 110 to 111 hold a two-digit record type (bolded in the original post). In the sample below the record types are 32, 32, 31, 31 and 35. The real data contains thousands more records, and there are more than 100 record types in a file. I have to split the file based on the record type in positions 110 to 111.

000000008101 000011000700000000000000000000000001234567454002000 832I20090109 1234567097009967
123450007101 000000000000000000007856343446560000007856454540000 832I20090109 9864536670002456
957645465778 000011000700000000000000000000000067645333567743355 831I20090109 7854536670005647
676767497101 000011000700000000000000000000000008898675335767676 831I20090109 4565767665545469
767865444567 000011000700000000000000000000000007876564454676877 835I20090109 8786756656677887

TASK1: I have to split the file by record type. The output in this case will be three files:

File1 RecordType32
000000008101 000011000700000000000000000000000001234567454002000 832I20090109 1234567097009967
123450007101 000000000000000000007856343446560000007856454540000 832I20090109 9864536670002456

File2 RecordType31
957645465778 000011000700000000000000000000000067645333567743355 831I20090109 7854536670005647
676767497101 000011000700000000000000000000000008898675335767676 831I20090109 4565767665545469

File3 RecordType35
767865444567 000011000700000000000000000000000007876564454676877 835I20090109 8786756656677887

Can anybody help me with a solution for this? I am not good at UNIX shell scripting.


TASK2: I need a unique list of the record types in the file. For my sample the result should be:
32
31
35
  #2 (permalink)  
03-05-2009
otheus
Man this is pretty easy. Surprised no one followed up:
Code:
awk 'length($0) >= 111 { type = substr($0, 110, 2); ofile = "type-" type ".dat"; print $0 > ofile }
     length($0) <  111 { print $0 > "type-short.dat" }' file

Last edited by otheus; 03-09-2009 at 04:39 AM.. Reason: corrected per zTodd
  #3 (permalink)  
03-08-2009
techsavvy007

Hi otheus,
Thank you for your reply. I appreciate it. I will try this at work tomorrow, as I do not have access from home.

Have a nice day
  #4 (permalink)  
03-08-2009
Franklin52
Assuming the record types are in the 67th position as in your example and not in position 110, this should be sufficient:

Code:
awk '{print > "RecordType" substr($0,67,2)}' file
Regards
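Since the real file is said to contain more than 100 record types, it may be worth noting that some awk implementations limit how many output files can be open at once. A minimal sketch that sidesteps this by appending and closing after every write, assuming the type sits in columns 67-68 as in the one-liner above. The file name sample.txt and its synthetic records are made up for illustration:

```shell
# Work in a scratch directory so reruns start clean (hypothetical setup).
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Synthetic records: 65 digits, an "8", then the two-digit type in columns 67-68.
{
  printf '%065d8%sI20090109\n' 1 32
  printf '%065d8%sI20090109\n' 2 31
  printf '%065d8%sI20090109\n' 3 32
} > sample.txt

awk '{
    out = "RecordType" substr($0, 67, 2)
    print >> out    # append, so reopening the file does not truncate it
    close(out)      # keep only one output file open at a time
}' sample.txt
```

Because the sketch appends, any RecordType* files left over from an earlier run should be removed first. Closing after every line is slower than keeping files open, so this is only worth it if the plain one-liner actually fails with a "too many open files" style error.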
  #5 (permalink)  
03-09-2009
zTodd
Quote:
Originally Posted by otheus
Man this is pretty easy. Surprised no one followed up:
Code:
awk 'length($0) > 111 { type=substr($0,110,2); ofile="type-" type ".dat"; print $0 > test; } 
       length($0) <= 111 { print $0 >"type-short.dat" }'
Is the redirect target on the print statement (highlighted in red in the original post) a typo? Was it supposed to be ofile? I didn't test it; it just seemed so...

Last edited by otheus; 03-09-2009 at 04:39 AM.. Reason: oops!
  #6 (permalink)  
03-09-2009
zTodd
For task 2, I believe you can use sed or awk to extract the record type, piped to sort with the -u option. A Google search for "sed" and for "unix sort" should turn up plenty of good material to learn from.
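The suggestion above might look like the following sketch. It assumes the record type is characters 2-3 of the third whitespace-separated field, which holds for the sample posted in the question; sample.txt and types.txt are hypothetical file names:

```shell
# Recreate the sample from the question in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
cat > sample.txt <<'EOF'
000000008101 000011000700000000000000000000000001234567454002000 832I20090109 1234567097009967
123450007101 000000000000000000007856343446560000007856454540000 832I20090109 9864536670002456
957645465778 000011000700000000000000000000000067645333567743355 831I20090109 7854536670005647
676767497101 000011000700000000000000000000000008898675335767676 831I20090109 4565767665545469
767865444567 000011000700000000000000000000000007876564454676877 835I20090109 8786756656677887
EOF

# Extract the two-digit type from each record, then sort and drop duplicates.
awk '{ print substr($3, 2, 2) }' sample.txt | sort -u > types.txt
cat types.txt
```

For the sample this prints 31, 32 and 35, one per line, in sorted order rather than in order of appearance.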
  #7 (permalink)  
03-09-2009
ripat
Quote:
Originally Posted by zTodd
For task 2, I believe you can use sed or awk to extract the record type, piped to sort with the -u option. A Google search for "sed" and for "unix sort" should turn up plenty of good material to learn from.
Task 1 and 2 in one go:
Code:
awk '{_types[substr($3,2,2)]; print > "/tmp/type"substr($3,2,2)} END {for (i in _types) print i }' file  > /tmp/type.list
The /tmp/type.list file is not sorted. If you need to sort:
Code:
awk '{_types[substr($3,2,2)]; print > "/tmp/type"substr($3,2,2)} END {for (i in _types) print i}' file | sort >  /tmp/type.list
If your version of awk supports the asorti() function you could also use it but I don't recommend its use as it is rather cumbersome. (In fact it is the function I hate the most in gawk!)
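If the list should come out in order of first appearance (32, 31, 35 in the question) rather than sorted, a common awk idiom can deduplicate without sort or asorti(). A sketch, assuming the same field layout as the one-liners above; sample.txt, types.txt and the array name seen are illustrative:

```shell
# Recreate the sample from the question in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
cat > sample.txt <<'EOF'
000000008101 000011000700000000000000000000000001234567454002000 832I20090109 1234567097009967
123450007101 000000000000000000007856343446560000007856454540000 832I20090109 9864536670002456
957645465778 000011000700000000000000000000000067645333567743355 831I20090109 7854536670005647
676767497101 000011000700000000000000000000000008898675335767676 831I20090109 4565767665545469
767865444567 000011000700000000000000000000000007876564454676877 835I20090109 8786756656677887
EOF

# seen[t]++ is 0 (false) only the first time a type is met, so each type prints once.
awk '{ t = substr($3, 2, 2) } !seen[t]++ { print t }' sample.txt > types.txt
cat types.txt   # prints 32, 31, 35, one per line
```

This keeps one array entry per distinct type, which stays small even with more than 100 record types.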
All times are GMT -4.

The UNIX and Linux Forums Content Copyright ©1993-2009. All Rights Reserved.