

How to Split File based on String?


 
# 1  
Old 08-15-2013

Hi,

The scenario is this:

I have large text files (up to 5 MB each, about 5000 files per day). Almost every line of each file contains the tag 3100.2.22.1 (which represents Call_Type). I need to generate one file per distinct 3100.2.22.1 Call_Type value, plus one more file that collects all lines without the 3100.2.22.1 tag.

The question is: how can I split such a file using bash/sed/awk?

Sample content of the file hd_auto_22700123_0021 (there are a lot of Call_Type values):
Code:
! HISTORICAL DATA ! ONE FILE DECODING REPORT ! SERVICE : ce20 ! FILE : /osp/spm/svc/ !
! TICKET NBR : 1 ! GSI : 102 ! 3100.2.22.3 0 ! 3100.2.137.4 665004551 ! 3100.2.22.8 Browsing !
! TICKET NBR : 2 ! GSI : 102 ! 3100.2.137.4 665017728 !3100.2.22.2 7 ! 3100.2.70.8 1050 ! 3100.2.22.1 189 ! 3100.2.70.11 016c6f63000431333000 !
! TICKET NBR : 3 ! GSI : 102 ! 3100.2.137.4 665017728 ! 3100.2.97.1 192.168.0.12 ! 3100.2.19.2 665017728 ! 3100.2.22.2 7 ! 3100.2.70.11 016c6f63000431333000 !
! TICKET NBR : 4 ! GSI : 102 ! 3100.2.137.4 665002105 ! 3100.2.97.1 192.168.0.12 ! 3100.2.19.2 665002105 ! 3100.2.22.1 410 ! 3100.2.70.11 016c6f63000431333000 !
! TICKET NBR : 5 ! GSI : 102 ! 3100.2.22.3 0 ! 3100.2.137.4 665009058 ! 3100.2.97.1 192.168.0.12 ! 3100.2.22.1 164 ! 3100.2.70.11 016c6f63000431333000 !
! TICKET NBR : 6 ! GSI : 102 ! 3100.2.22.3 0 ! 3100.2.137.4 665012633 ! 3100.2.97.1 192.168.0.12 ! 3100.2.18.1 0 ! 3100.2.22.1 189 ! 3100.2.70.11 016c6f63000431333000 !
! TICKET NBR : 7 ! GSI : 102 ! 3100.2.22.3 0 ! 3100.2.137.4 665019277 ! 3100.2.97.1 192.168.0.12 ! 3100.2.22.1 164 ! 3100.2.70.11 016c6f63000431333000 !   
! TICKET NBR : 8 ! GSI : 102 ! 3100.2.112.1 15/08/2013 10:42:43 ! 3100.2.22.8 Free_Traffic ! 3100.2.97.1 192.168.0.12  ! 3100.2.22.11 2 !
.
.
.
! RESULT = successfull 1657 tickets treated !


The result of the split should look like this:

Code:
hd_auto_22700123_0021_without_tag  (without 3100.2.22.1 tag)
! TICKET NBR : 1 ! GSI : 102 ! 3100.2.22.3 0 ! 3100.2.137.4 665004551 ! 3100.2.22.8 Browsing !
! TICKET NBR : 3 ! GSI : 102 ! 3100.2.137.4 665017728 ! 3100.2.97.1 192.168.0.12 ! 3100.2.19.2 665017728 ! 3100.2.22.2 7 ! 3100.2.70.11 016c6f63000431333000 !
! TICKET NBR : 8 ! GSI : 102 ! 3100.2.112.1 15/08/2013 10:42:43 ! 3100.2.22.8 Free_Traffic ! 3100.2.97.1 192.168.0.12  ! 3100.2.22.11 2 !
! RESULT = successfull 3 tickets treated !

Code:
hd_auto_22700123_0021_189 (with tag 3100.2.22.1 189)
! TICKET NBR : 2 ! GSI : 102 ! 3100.2.137.4 665017728 !3100.2.22.2 7 ! 3100.2.70.8 1050 ! 3100.2.22.1 189 ! 3100.2.70.11 016c6f63000431333000 !
! TICKET NBR : 6 ! GSI : 102 ! 3100.2.22.3 0 ! 3100.2.137.4 665012633 ! 3100.2.97.1 192.168.0.12 ! 3100.2.18.1 0 ! 3100.2.22.1 189 ! 3100.2.70.11 016c6f63000431333000 !
! RESULT = successfull 2 tickets treated !

Code:
hd_auto_22700123_0021_410 (with tag 3100.2.22.1 410)
! TICKET NBR : 4 ! GSI : 102 ! 3100.2.137.4 665002105 ! 3100.2.97.1 192.168.0.12 ! 3100.2.19.2 665002105 ! 3100.2.22.1 410 ! 3100.2.70.11 016c6f63000431333000 !
! RESULT = successfull 1 tickets treated !

Code:
hd_auto_22700123_0021_164 (with tag 3100.2.22.1 164)
! TICKET NBR : 5 ! GSI : 102 ! 3100.2.22.3 0 ! 3100.2.137.4 665009058 ! 3100.2.97.1 192.168.0.12 ! 3100.2.22.1 164 ! 3100.2.70.11 016c6f63000431333000 !
! TICKET NBR : 7 ! GSI : 102 ! 3100.2.22.3 0 ! 3100.2.137.4 665019277 ! 3100.2.97.1 192.168.0.12 ! 3100.2.22.1 164 ! 3100.2.70.11 016c6f63000431333000 !
! RESULT = successfull 2 tickets treated !


Last edited by OTNA; 08-15-2013 at 11:06 AM..
# 2  
Old 08-15-2013
Try
Code:
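awk -F! '
# Tagged records: dots escaped and a space required after the tag, so 3100.2.22.11 is not taken for 3100.2.22.1
match ($0, /3100\.2\.22\.1 [^!]*/) {print > (FILENAME " " substr ($0, RSTART, RLENGTH)); next}
# All remaining records go to the "without_tag" file
                                   {print > (FILENAME " without_tag")}
' hd_auto_*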

This User Gave Thanks to RudiC For This Post:
# 3  
Old 08-15-2013
Wow, thank you!
Can you please explain how this script does its magic?
# 4  
Old 08-16-2013
It tries to match each record against your 3100... tag plus the call type, expressed as a regex. If a match is found, RSTART and RLENGTH (see man awk) are sufficient to locate the matched string and extract it for use in the file name, to which the entire record is then printed. If there is no match, the record is printed to the "without" file.
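To see what match() leaves behind, here is a minimal sketch run on a single ticket line shortened from the sample. The variable names line and result are made up for illustration, and the regex is written with escaped dots and a required space after the tag, which is an assumption about the data rather than something stated in the thread.

```shell
#!/bin/sh
# One ticket line, shortened from the sample file (illustrative only).
line='! TICKET NBR : 2 ! GSI : 102 ! 3100.2.22.1 189 ! 3100.2.70.8 1050 !'

# match() returns the start position (also stored in RSTART) and sets RLENGTH
# to the length of the matched text; substr() then cuts that text out.
result=$(printf '%s\n' "$line" |
    awk 'match($0, /3100\.2\.22\.1 [^!]*/) {
             print RSTART, RLENGTH, "[" substr($0, RSTART, RLENGTH) "]"
         }')

echo "$result"   # prints: 32 16 [3100.2.22.1 189 ]
```

The brackets show that the match runs up to, but not including, the next "!", so it keeps the blank in front of it.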
I see now that the -F! is not needed at all...
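For output closer to what was asked for in post #1 (file names ending in _189 or _without_tag, and a recomputed RESULT line in each file), the following is a rough sketch and not a tested production script. The function name split_by_call_type and the demo file hd_auto_demo are made up for illustration. It assumes that every ticket line starts with "! TICKET NBR", that the Call_Type value contains no blank, "!" or "/", and that the header and the original RESULT line should be dropped.

```shell
#!/bin/sh
# Hypothetical helper: split one report into one file per Call_Type value.
split_by_call_type() {
    awk '
    # Assumption: only ticket lines are routed; the header and the original
    # RESULT line are skipped, and a new RESULT line is written per output file.
    !/^! TICKET NBR/ { next }
    {
        # A space is required after the tag, so 3100.2.22.11 is not mistaken for it.
        if (match($0, /3100\.2\.22\.1 +[^ !]+/)) {
            n = split(substr($0, RSTART, RLENGTH), part, / +/)
            key = part[n]              # the Call_Type value, e.g. 189
        } else {
            key = "without_tag"
        }
        out = FILENAME "_" key
        print > out
        count[out]++
    }
    END {
        # The files are still open here, so ">" continues after the ticket lines.
        for (out in count) {
            printf "! RESULT = successfull %d tickets treated !\n", count[out] > out
            close(out)
        }
    }
    ' "$1"
}

# Demo on a small made-up sample in a scratch directory.
dir=$(mktemp -d) || exit 1
cat > "$dir/hd_auto_demo" <<'EOF'
! HISTORICAL DATA ! ONE FILE DECODING REPORT !
! TICKET NBR : 1 ! GSI : 102 ! 3100.2.22.8 Browsing !
! TICKET NBR : 2 ! GSI : 102 ! 3100.2.22.1 189 ! 3100.2.70.8 1050 !
! TICKET NBR : 3 ! GSI : 102 ! 3100.2.22.1 410 !
! TICKET NBR : 4 ! GSI : 102 ! 3100.2.22.1 189 !
! TICKET NBR : 5 ! GSI : 102 ! 3100.2.22.11 2 !
! RESULT = successfull 5 tickets treated !
EOF
split_by_call_type "$dir/hd_auto_demo"
ls "$dir"
```

The output files are written next to the input file. Every output file stays open until END, so a file with very many distinct Call_Type values may run into the open-file limit of some awk implementations. Appending with ">>" and calling close() after each record would be one way around that, at the cost of having to remove old output files before a rerun.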