File breaking


 
# 1  
Old 10-21-2010
File breaking

Hey,

I have to take one CSV file and break it into several files. Let's say I have a file prices.csv with data like this:

1,12345
1,34567
1,23456
2,67890
2,77720
2,44556
2,55668
10,44996

I want to create files based on the first column. In this example, 1 is repeated three times, so I need to create a file groupnumber_1.csv that holds only the rows belonging to 1:
groupnumber_1.csv
1,12345
1,34567
1,23456

I also have to create one more file, groupnumber_2.csv:
2,67890
2,77720
2,44556
2,55668

and one more file, groupnumber_10.csv:
10,44996

And so on. This way I have to create as many CSV files as there are distinct values in the first column.

Please help me with the shell script for this.
# 2  
Old 10-21-2010
Code:
#  nawk '{f="groupnumber_"$1".csv";print $0>f}' FS="," infile

#  head groupnumber_*
==> groupnumber_1.csv <==
1,12345
1,34567
1,23456

==> groupnumber_10.csv <==
10,44996

==> groupnumber_2.csv <==
2,67890
2,77720
2,44556
2,55668

# 3  
Old 10-21-2010
Code:
awk -F, 'OFS, (++A[$1]) {print $0 > "groupnumber_"$1".csv"}' prices.csv

@Scru1Linizer: please don't laugh at my solution :)



---------- Post updated at 06:39 PM ---------- Previous update was at 06:31 PM ----------


Code:
awk -F, '{print $0 >"groupnumber_"$1".csv"}' prices.csv


Last edited by ctsgnb; 10-21-2010 at 01:45 PM..
# 4  
Old 10-21-2010
Thanks

Thanks for your quick responses. This forum is really helpful to me.
# 5  
Old 10-21-2010
A pitfall of these solutions is that they will fail if they reach the open file resource limit. If there are many files to be created, or if the allowed number of open files per process is low and cannot be raised, they may require the addition of a close after the print.

Just something to keep in mind just in case.
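
A minimal sketch of that variant, assuming the sample data from the first post is in prices.csv. Because each file is closed and reopened for every line, the redirection has to append (>>) rather than truncate (>), so output left over from an earlier run is removed first:

```shell
# Sample data from the first post (assumed file name: prices.csv).
printf '%s\n' 1,12345 1,34567 1,23456 2,67890 2,77720 2,44556 2,55668 10,44996 > prices.csv

# Remove output from earlier runs, because >> appends to existing files.
rm -f groupnumber_*.csv

awk -F, '{
    f = "groupnumber_" $1 ".csv"
    print $0 >> f   # append, since the file is reopened for every line
    close(f)        # release the file descriptor right away
}' prices.csv
```

Closing after every line keeps only one output file open at a time, at the cost of an extra open and close per input line.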

Regards,
Alister
# 6  
Old 10-21-2010
Dude, Alister... always nitpicking, but still always right! :)

Under Solaris:

rlim_fd_max

Description: Specifies the "hard" limit on file descriptors that a single process might have open. Overriding this limit requires superuser privilege.
Data Type: Signed integer
Default: 65,536
Range: 1 to MAXINT
Units: File descriptors
Dynamic?: No
Validation: None
When to Change: When the maximum number of open files for a process is not enough. Other limitations in system facilities can mean that a larger number of file descriptors is not as useful as it might be. For example:
■ A 32-bit program using standard I/O is limited to 256 file descriptors. A 64-bit program using standard I/O can use up to 2 billion descriptors. Specifically, standard I/O refers to the stdio(3C) functions in libc(3LIB).
■ select is by default limited to 1024 descriptors per fd_set. For more information, see select(3C). Starting with the Solaris 7 release, 32-bit application code can be recompiled with a larger fd_set size (less than or equal to 65,536). A 64-bit application uses an fd_set size of 65,536, which cannot be changed.
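
As a quick check before running one of the awk commands above, the limits that apply to the current shell can be printed with the ulimit built-in. This is only a sketch of how to look them up; the values depend on the system and its configuration:

```shell
# Soft limit: how many file descriptors this shell and its children may have open.
ulimit -n
# Hard limit: the ceiling up to which the soft limit can be raised.
ulimit -Hn
```

If the number of distinct values in the first column is close to or above the soft limit, the variant that closes each file after the print is the safer choice.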
# 7  
Old 11-24-2010
small change

nawk '{f="groupnumber_"$1".csv";print $0>f}' FS="," infile


Thanks

This is good, but I am trying to remove the first column after the files are generated. Can we add anything to the above command? Can you help me with this, please?
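
One possible way to do this, as a sketch rather than a tested answer from the thread: build the file name from the first field, then strip that field and its comma from the line before printing. It assumes the sample data from the first post in prices.csv, and plain awk can be swapped for nawk:

```shell
# Sample data from the first post (assumed file name: prices.csv).
printf '%s\n' 1,12345 1,34567 1,23456 2,67890 2,77720 2,44556 2,55668 10,44996 > prices.csv

awk -F, '{
    f = "groupnumber_" $1 ".csv"   # file name still comes from the first column
    sub(/^[^,]*,/, "")             # remove the first field and its comma from the line
    print > f
}' prices.csv
```

With this input, groupnumber_1.csv holds only 12345, 34567 and 23456, one per line. The same open file limit caveat from the earlier post applies here as well.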