sed awk: split a large file to unique file names


 
Posted 08-24-2016

Dear Users,


I would appreciate your help with splitting a large file (more than 1 million lines) using sed or awk. Below is a sample of the text in the file.
input file.txt
Code:
scaffold1       928     929     C/T     +
scaffold1       942     943     G/C     +
scaffold1       959     960     C/T     +
scaffold1       994     995     G/A     +
scaffold2       1024    1025    G/A     +
scaffold2       1065    1066    G/A     +
scaffold2       1356    1357    C/T     +
scaffold2       1363    1364    G/A     +
scaffold3       1367    1368    G/A     +
scaffold3       1403    1404    G/A     +
scaffold3       1404    1405    C/T     +
scaffold3       1433    1434    G/A     +
scaffold3       1467    1468    G/A     +
scaffold4       1521    1522    G/A     +
scaffold4       63885   63886   T/G     +
scaffold4       63907   63908   G/A     +
scaffold4       63942   63943   T/C     +
scaffold4       63964   63965   G/A     +
scaffold5       63996   63997   G/A     +
scaffold5       63997   63998   T/C     +
scaffold5       64074   64075   G/T     +
scaffold100       64076   64077   C/T     +
scaffold100       64127   64128   C/T     +
scaffold120       64221   64222   A/G     +
scaffold1100       64222   64223   T/C     +
scaffold1890       64263   64264   C/T     +
scaffold2000       64281   64282   G/C     +
scaffold2001       64292   64293   C/T     +
scaffold2002      64343   64344   G/A     +
scaffold2003       64347   64348   G/T     +

Each output file should contain only the lines that share the same name in the first column.
Output files:
file1.txt
Code:
scaffold1       928     929     C/T     +
scaffold1       942     943     G/C     +
scaffold1       959     960     C/T     +
scaffold1       994     995     G/A     +

file2.txt
Code:
scaffold2       1024    1025    G/A     +
scaffold2       1065    1066    G/A     +
scaffold2       1356    1357    C/T     +
scaffold2       1363    1364    G/A     +

file3.txt
Code:
scaffold3       1367    1368    G/A     +
scaffold3       1403    1404    G/A     +
scaffold3       1404    1405    C/T     +
scaffold3       1433    1434    G/A     +
scaffold3       1467    1468    G/A     +

and so on.

Thank you,
kapr0001



Moderator's Comments:
Please use CODE tags as required by forum rules!

Last edited by RudiC; 08-24-2016 at 04:13 PM.. Reason: Added CODE tags.
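
One possible approach with awk, offered as a sketch rather than a definitive answer. It assumes that all lines for a given scaffold are contiguous in the input (as they are in the sample above) and that the output files should be numbered file1.txt, file2.txt, and so on in order of first appearance. The name sample_input.txt is hypothetical and only stands in for the real file; the heredoc holds a few lines copied from the sample so the commands can be tried as-is.

```shell
#!/bin/sh
# Small stand-in for the real input (sample_input.txt is a hypothetical name).
cat > sample_input.txt <<'EOF'
scaffold1       928     929     C/T     +
scaffold1       942     943     G/C     +
scaffold2       1024    1025    G/A     +
scaffold3       1367    1368    G/A     +
scaffold3       1403    1404    G/A     +
EOF

# Start a new numbered output file each time the first column changes.
# Assumption: lines with the same first column are adjacent in the input.
awk '
NR == 1 || $1 != prev {
    if (out != "") close(out)   # release the previous file handle
    n++
    out = "file" n ".txt"
    prev = $1
}
{ print > out }                 # whole line goes to the current file
' sample_input.txt
```

Closing each file before opening the next keeps the number of open files at one, which matters when the input has thousands of distinct scaffold names. If the same scaffold name can reappear later in the file, sort the input on the first column before running this, otherwise the later block would be written to a new file. Existing files named file1.txt, file2.txt, and so on in the current directory are overwritten.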
 