[Solved] Split one file into more than one file


 
# 1  
Old 02-12-2014

Guys,

I have an input file

Code:
MGC001|108.28|-2.86489|100-120|MANGGAC
MGC002|108.071|-2.69028|80-100|KELAPA KAMPIT
MGC003|108.168|-2.97053|50-80|GANTUNG
MGC007|108.192722222|-2.766138889|0-50|KELAPA KAMPIT
MGC008|108.11075|-3.002666667|0-50|GANTUNG
MGC015|108.281555555556|-2.88105555555556|50-80|MANGGAC
MGC016|108.246|-2.876916667|50-80|MANGGAC
MGC019|108.187611111|-2.917472222|0-50|MANGGAC
MGC025|108.231722222|-2.806277778|50-80|MANGGAC

I want to split it into multiple files based on the unique strings in the 5th column.
The sample above has only 3 unique strings in the 5th column, but my real data has more, approximately 30. For the sample, my output should be:

MANGGAC.txt
Code:
MGC001|108.28|-2.86489|100-120|MANGGAC
MGC015|108.281555555556|-2.88105555555556|50-80|MANGGAC
MGC016|108.246|-2.876916667|50-80|MANGGAC
MGC019|108.187611111|-2.917472222|0-50|MANGGAC
MGC025|108.231722222|-2.806277778|50-80|MANGGAC

KELAPA KAMPIT.txt
Code:
MGC002|108.071|-2.69028|80-100|KELAPA KAMPIT
MGC007|108.192722222|-2.766138889|0-50|KELAPA KAMPIT

GANTUNG.txt
Code:
MGC003|108.168|-2.97053|50-80|GANTUNG
MGC008|108.11075|-3.002666667|0-50|GANTUNG

Thanks
# 2  
Old 02-12-2014
A few questions for you:-
  • What have you tried so far?
  • What errors are you getting?
  • What OS and version are you using?
  • What are your preferred tools?

Most importantly, what have you tried so far?


I have a few suggestions, but it's better for you to explain where you are stuck so someone here can help.


Regards,
Robin
# 3  
Old 02-13-2014
I did this

Code:
grep -E "MANGGAC" input > MANGGAC.txt
grep -E "GANTUNG" input > GANTUNG.txt
grep -E "KELAPA KAMPIT" input > KELAPAKAMPIT.txt

With this approach I have to list every unique string in the 5th column first and then run a separate grep -E for each one.
I don't have a list of the unique strings in the 5th column of my original data, and the data consists of about 100,000 rows.

Is there a faster solution?
Thanks
# 4  
Old 02-13-2014
Code:
awk -F\| '{f=$NF ".txt"; print $0 >> f;close(f)}' yourfile

# 5  
Old 02-13-2014
Quote:
Originally Posted by radius
Since I don't have list of uniq string in column 5th in my original data
Such a list is easy to create. Depending on what else you want to do, you could (scripts not tested, may need a bit of adaptation):
  • read the 5th column only and pipe it through "sort -u" (plain "uniq" only drops adjacent duplicates, so it needs sorted input), for instance:
    Code:
     while IFS="|" read -r junk junk junk junk VALUE ; do print - "$VALUE" ; done</path/to/input.file | sort -u

    or
  • use "sort -u" keyed on the fifth field, which prints one whole line per distinct value:
    Code:
    sort -t '|' -uk 5,5 /path/to/input.file
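
As a minimal sketch of the list-building step (not part of the original post), the same list can also be produced with cut and sort. The file name "input" and the scratch directory are assumptions for illustration only; the rows are copied from the first post.

```shell
# Scratch directory so the sketch does not touch real data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# A few sample rows from the first post; "input" is a hypothetical file name.
cat > input <<'EOF'
MGC001|108.28|-2.86489|100-120|MANGGAC
MGC002|108.071|-2.69028|80-100|KELAPA KAMPIT
MGC003|108.168|-2.97053|50-80|GANTUNG
MGC007|108.192722222|-2.766138889|0-50|KELAPA KAMPIT
MGC008|108.11075|-3.002666667|0-50|GANTUNG
EOF

# cut keeps only field 5; sort -u sorts the values and drops duplicates.
values=$(cut -d '|' -f 5 input | sort -u)
printf '%s\n' "$values"
```

Unlike the keyed sort, this prints only the fifth field, which is usually what a follow-up loop over the values needs.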


I hope this helps.

bakunin
# 6  
Old 02-13-2014
Quote:
Originally Posted by ctsgnb
Code:
awk -F\| '{f=$NF ".txt"; print $0 >> f;close(f)}' yourfile

Hi ctsgnb, we can improve this a little bit. Because it always appends (>>), a second run doubles every output file:

Code:
# First Run
[akshay@aix tmp]$ awk -F\| '{f=$NF ".txt"; print $0 >> f;close(f)}' file

[akshay@aix tmp]$ ls *.txt
GANTUNG.txt  KELAPA KAMPIT.txt  MANGGAC.txt

[akshay@aix tmp]$ wc -l *.txt
  2 GANTUNG.txt
  2 KELAPA KAMPIT.txt
  5 MANGGAC.txt
  9 total

# Second Run
[akshay@aix tmp]$ awk -F\| '{f=$NF ".txt"; print $0 >> f;close(f)}' file

[akshay@aix tmp]$ ls *.txt
GANTUNG.txt  KELAPA KAMPIT.txt  MANGGAC.txt

[akshay@aix tmp]$ wc -l *.txt
  4 GANTUNG.txt
  4 KELAPA KAMPIT.txt
 10 MANGGAC.txt
 18 total


This version truncates each file the first time it is written in a run and appends after that:

Code:
$ awk -F\| '{ F = $5".txt"; if(F in A)print $0 >>F; else print >F; A[F]; close(F)}' file
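
For readers who want to see why this version is safe to rerun, here is the same logic spelled out with comments and run twice. This is only a sketch: the scratch directory, the file name "input", the function name split_by_field5 and the array name seen are assumptions made for the demo, and the rows are the sample from the first post.

```shell
# Scratch directory so the sketch does not touch real data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Sample rows from the first post; "input" is a hypothetical file name.
cat > input <<'EOF'
MGC001|108.28|-2.86489|100-120|MANGGAC
MGC002|108.071|-2.69028|80-100|KELAPA KAMPIT
MGC003|108.168|-2.97053|50-80|GANTUNG
MGC007|108.192722222|-2.766138889|0-50|KELAPA KAMPIT
MGC008|108.11075|-3.002666667|0-50|GANTUNG
MGC015|108.281555555556|-2.88105555555556|50-80|MANGGAC
MGC016|108.246|-2.876916667|50-80|MANGGAC
MGC019|108.187611111|-2.917472222|0-50|MANGGAC
MGC025|108.231722222|-2.806277778|50-80|MANGGAC
EOF

# Hypothetical helper wrapping the one-liner above.
split_by_field5() {
    awk -F'|' '
    {
        out = $5 ".txt"      # output file is named after field 5
        if (out in seen)
            print >> out     # already written in this run: append
        else
            print > out      # first write in this run: truncate old content
        seen[out]            # referencing the element records the name
        close(out)           # release the file before reading the next line
    }' "$1"
}

split_by_field5 input
split_by_field5 input        # second run replaces the files instead of doubling them

wc -l < MANGGAC.txt
wc -l < "KELAPA KAMPIT.txt"
wc -l < GANTUNG.txt
```

After both runs the line counts are still 5, 2 and 2. Closing the file after every line keeps the number of open files at one, at the cost of reopening a file for each input row.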

# 7  
Old 02-13-2014
It works, really helpful.

Moderator's Comments:
Mod Comment edit by bakunin: good - it would have been nice to post/describe the final solution, though! changed thread title to SOLVED.

Last edited by bakunin; 02-13-2014 at 07:21 AM..
 