Splitting the file based on logic


 
# 8  
Old 03-27-2009
Sorry, I don't have access to an AIX machine right now ...
You could try this, if you wish:
Code:
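awk 'NR == 1 || /.H... / && _++ {
  # switch to a new output file; _ counts headers seen in the current file
  # NR == 1 makes sure fn is set before the first print
  fn && close(fn)
  fn = "part_" (++c); _ = 0
  }
{ print > fn }' infile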

Is there a New awk (nawk) on AIX?

What shell are you using?
# 9  
Old 03-27-2009
Hello

Sorry, it works fine when I use nawk. Thanks for your help. Can you explain the code to me?
In future I may have to split the input file every 50 invoices.

Thanks
# 10  
Old 03-27-2009
Hello radoulov,

Thanks for your help. When I ran the script on my real data, it did not split the file as expected.

My real data looks like the sample below, which contains 3 invoices. Due to space constraints I had to delete some information from every row. Each invoice starts with a header line whose 63rd character is "H" and whose 67th character is " " (space).

In reality I need to split the input file into files of 50 invoices each. For this sample, I want the split to put 2 invoices in each file.


000020997259 20090209130904H000 000 000 000CAHIH
000020997259 20090209130904H000A000 00
000020997259 20090209130904H000E000 000 000CA
000020997259 20090209130904H000N001 000 000CAINVHOAD
000020997259 20090209130904H000N002 000 000CAIHVDOCUM
000020997259 20090209130904H000N003 000 000CHINVFOR
000020997259 20090209130904H000T004 000 000CH090307
000020997259 20090209130904H000U001 000 000CATHACK-
000020997259 20090209130904H000U002 000 000CATRACK-
000020997259 20090209130904H000U003 000 000CATHACK-
000020997259 20090209130904H000U004 000 000CATHACK-
000020997259 20090209130904L001 000 000 000CA000000
000020997259 20090209130904L001M001 000 000CAADL
000020997259 20090209130904L001M002 000 000CATHRCHG
000020997259 20090209130904L002 000 000 000CA000000
000020997259 20090209130904L002M001 000 000CAADL
000020997259 20090209130904L002M002 000 000CATTRCHG
000020997259 20090209130904L003 000 000 000CA000000
000020997259 20090209130904L003M002 000 000CATTRCHG
000020997259 20090209130904L004 000 000 000CA000000
000020997259 20090209130904L004M001 000 000CAADL
000020997259 20090209130904L004M002 000 000CATTRCHG
000020997259 20090209130904T999 000 000 000CAPUROLA
000020997259 20090209130904T999M001 000 000CACST
000020997359 20090209130904H000 000 000 000CADI2009
000020997359 20090209130904H000A000 000 000CA
000020997359 20090209130904H000E000 000 000CA
000020997359 20090209130904H000N001 000 000CAINVLOA
000020997359 20090209130904H000N002 000 000CAINVFOR
000020997359 20090209130904H000T003 000 000CA090307
000020997359 20090209130904H000U001 000 000CATRACK-
000020997359 20090209130904H000U002 000 000CATRACK-
000020997359 20090209130904H000U003 000 000CATRACK-
000020997359 20090209130904H000U004 000 000CATRACK-
000020997359 20090209130904L001 000 000 000CA000000
000020997359 20090209130904L001M001 000 000CAADL
000020997359 20090209130904L002 000 000 000CA000000
000020997359 20090209130904L002M001 000 000CAADL
000020997359 20090209130904L003 000 000 000CA000000
000020997359 20090209130904L004 000 000 000CA000000
000020997359 20090209130904L004M001 000 000CAADL
000020997359 20090209130904T999 000 000 000CAPUROLA
000020997359 20090209130904T999M001 000 000CACSC
000020998659 20090209130904H000 000 000 000CADI2009
000020998659 20090209130904H000A000 000 000CA
000020998659 20090209130904H000E000 000 000CA
000020998659 20090209130904H000N001 000 000CAINVLOA
000020998659 20090209130904H000N002 000 000CAINVDOC
000020998659 20090209130904H000N003 000 000CAINVFOR
000020998659 20090209130904H000T004 000 000CA090307
000020998659 20090209130904H000U001 000 000CATRACK-
000020998659 20090209130904H000U002 000 000CATRACK-
000020998659 20090209130904L001 000 000 000CA000000
000020998659 20090209130904L001M001 000 000CAADL
000020998659 20090209130904L002 000 000 000CA000000
000020998659 20090209130904L002M001 000 000CAADL
000020998659 20090209130904T999 000 000 000CAPUROLA
000020998659 20090209130904T999M001 000 000CACST

Thank you for your help again.
# 11  
Old 03-27-2009
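The character positions seem different in your example: in the sample you posted, the "H" is at position 28 and the space at position 32, not 63 and 67. The code below uses 28; for the real data the offset would presumably be 63, i.e. substr($0,63,5).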

Code:
awk 'NR == 1 || substr($0,28,5) ~ /H... / && _++ == 50 {
  # new output file on line 1 and again once 50 invoice headers have been seen
  fn && close(fn); fn = "part_" ++c; _ = 1
  }
{ print > fn }' infile
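
Since an explanation of the code was requested: the idea is to count invoice header lines and switch to a new output file each time the current one has received its quota, while every input line is appended to whichever file is currently open. The following is only a sketch of that idea, not the exact command above. It assumes the header sits at position 28 as in the posted sample, and the variable names `n`, `pos` and `seen`, the temporary directory and the six-line test file are made up for the demonstration.

```shell
#!/bin/sh
# Demo only: build a small sample (3 invoices) in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > infile <<'EOF'
000020997259 20090209130904H000 000 000 000CAHIH
000020997259 20090209130904L001 000 000 000CA000000
000020997359 20090209130904H000 000 000 000CADI2009
000020997359 20090209130904T999 000 000 000CAPUROLA
000020998659 20090209130904H000 000 000 000CADI2009
000020998659 20090209130904T999M001 000 000CACST
EOF

# n   = invoices per output file (assumed parameter, 50 for the real job)
# pos = column of the "H" in a header line (28 in the sample, 63 was reported
#       for the real data)
awk -v n=2 -v pos=28 '
  # Header line: "H" at pos and a space four characters later.
  # seen counts all headers so far; every n-th header opens a new file.
  substr($0, pos, 5) ~ /^H... $/ && seen++ % n == 0 {
    if (fn) close(fn)        # keep the number of open files small
    fn = "part_" (++c)
  }
  fn { print > fn }          # lines before the first header are skipped
' infile

wc -l part_*
```

With the sample above, part_1 should receive the first two invoices (four lines) and part_2 the third (two lines). Closing each file before opening the next matters on older awk implementations, which allow only a few files to be open at once.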
