bash: How to split up a file based on another?


 
# 1  
Old 10-11-2011

I've got these 2 files, FILE.txt and SPLIT_BY.txt:

FILE.txt contents:
Code:
FILE01
FILE02
FILE03
FILE04
FILE05
FILE06
FILE07
FILE08
FILE09
FILE10
FILE11
FILE12
FILE13
FILE14
FILE15

SPLIT_BY.txt contents:
Code:
2
5
1
7

I'm trying to split the contents of FILE.txt into chunks whose sizes are given by the values in SPLIT_BY.txt.
I.e., the following files would be created:

FILE1.txt (create FILE1.txt based on '2' in SPLIT_BY.txt)
Code:
FILE01
FILE02

FILE2.txt (create FILE2.txt based on '5' in SPLIT_BY.txt)
Code:
FILE03
FILE04
FILE05
FILE06
FILE07

FILE3.txt (create FILE3.txt based on '1' in SPLIT_BY.txt)
Code:
FILE08

FILE4.txt (create FILE4.txt based on '7' in SPLIT_BY.txt)
Code:
FILE09
FILE10
FILE11
FILE12
FILE13
FILE14
FILE15

I tried different variations of grep, sed, head, and awk, and ended up in a logical mess. Any help would be greatly appreciated. Thanks!


Last edited by radoulov; 10-11-2011 at 06:34 PM..
# 2  
Old 10-11-2011
Code:
awk 'NR == FNR {
  p || p = 1
  for (i = p; i < $1 + p; i++)
    nr[i] = "FILE" (t[NR]++ ? c : ++c) 
  p = i; next    
  }
{ 
  print > nr[FNR] 
  }' SPLIT_BY.txt  FILE.txt
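
For readers following along: the command above makes two passes. While reading SPLIT_BY.txt (NR == FNR) it maps each line number of FILE.txt to an output name, and while reading FILE.txt it prints every line to its mapped file. As written it creates FILE1, FILE2, and so on, without a .txt suffix. Below is a more verbose sketch of the same two-pass idea, not radoulov's exact code: the variable names `out`, `last` and `c` are illustrative, the .txt suffix is assumed from the question, and the sample data is recreated in a scratch directory.

```shell
# Recreate the sample inputs from the question in a scratch directory.
cd "$(mktemp -d)" || exit 1
awk 'BEGIN { for (i = 1; i <= 15; i++) printf "FILE%02d\n", i }' > FILE.txt
printf '%s\n' 2 5 1 7 > SPLIT_BY.txt

awk '
NR == FNR {                      # first file: SPLIT_BY.txt
  ++c                            # chunk number: 1, 2, 3, ...
  for (i = 1; i <= $1; i++)      # claim the next $1 line numbers of FILE.txt
    out[++last] = "FILE" c ".txt"
  next
}
FNR in out {                     # second file: FILE.txt
  print > (out[FNR])             # lines beyond the mapped range are skipped
}' SPLIT_BY.txt FILE.txt
```

With the sample data this should leave FILE1.txt to FILE4.txt holding 2, 5, 1 and 7 lines respectively.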


Last edited by radoulov; 10-11-2011 at 07:37 PM..
# 3  
Old 10-12-2011
That's awesome radoulov. I discovered last night that it's a little more complex than I thought.

The SPLIT_BY.txt file will have two columns: the first is the volume_group_name and the second is still the split_by value.

I.e., SPLIT_BY.txt contents:
Code:
vgaaa 2
vgbbb 5
vgccc 1
vgddd 7

End results would be:
FILE1.txt
Code:
vgcreate vgaaa FILE01
vgextend vgaaa FILE02

FILE2.txt
Code:
vgcreate vgbbb FILE03
vgextend vgbbb FILE04
vgextend vgbbb FILE05
vgextend vgbbb FILE06
vgextend vgbbb FILE07

FILE3.txt
Code:
vgcreate vgccc FILE08

FILE4.txt
Code:
vgcreate vgddd FILE09
vgextend vgddd FILE10
vgextend vgddd FILE11
vgextend vgddd FILE12
vgextend vgddd FILE13
vgextend vgddd FILE14
vgextend vgddd FILE15

The first 'FILE##' line within each 'FILE#.txt' should be prefixed with 'vgcreate'; if there is more than one 'FILE##', every following line should be prefixed with 'vgextend'. Is this possible? Thanks!
# 4  
Old 10-12-2011
Code:
awk 'NR == FNR {
  p || p = 1
  for (i = p; i < $2 + p; i++) {
    if (i > p) {
      nr[i]  = "FILE" c
      cmd[i] = "vgextend" FS $1 
      }
    else {  
      nr[i]  = "FILE" ++c
      cmd[i] = "vgcreate" FS $1
      }
    }    
  p = i; next    
  }
{ 
  print cmd[FNR], $0 > nr[FNR] 
  }' SPLIT_BY.txt  FILE.txt
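
To see what the first pass stores in `nr[]` and `cmd[]`, a dry run can help. The sketch below is an illustration only, not part of the solution: it prints one row per FILE.txt line number with the output name and the command prefix that line would get. The file name mapping.txt and the variable names `line` and `verb` are made up for this example.

```shell
# Recreate the two-column SPLIT_BY.txt from the post in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'vgaaa 2' 'vgbbb 5' 'vgccc 1' 'vgddd 7' > SPLIT_BY.txt

# Dry run: line number, output file, command prefix (tab-separated).
awk '{
  ++c                                          # output file counter
  for (i = 1; i <= $2; i++) {
    ++line                                     # FILE.txt line number
    verb = (i == 1) ? "vgcreate" : "vgextend"  # first line of a group creates
    printf "%d\tFILE%d\t%s %s\n", line, c, verb, $1
  }
}' SPLIT_BY.txt > mapping.txt
cat mapping.txt
```

Each row corresponds to one `nr[i]` / `cmd[i]` pair; for the sample data there are 15 rows, and row 8 is the only one for vgccc.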

You may hit the open-files limit with some awk implementations.
The code below keeps only one file open for writing at a time.

Code:
awk 'NR == FNR {
  p || p = 1
  for (i = p; i < $2 + p; i++) {
    if (i > p) {
      nr[i]  = "FILE" c
      cmd[i] = "vgextend" FS $1 
      }
    else {  
      nr[i]  = "FILE" ++c
      cmd[i] = "vgcreate" FS $1
      }
    }    
  p = i; next    
  }
{ 
  print cmd[FNR], $0 > nr[FNR]
  t[nr[FNR]]++ || close(nr[FNR - 1])  # first line of a new file: close the previous one
  }' SPLIT_BY.txt  FILE.txt
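
A different way to keep only one output file open, shown here as a hedged alternative sketch rather than radoulov's method: treat SPLIT_BY.txt as the main input and pull the required number of lines from FILE.txt with getline, closing each output file as soon as its group is complete. The awk variable `devices`, the names `out`, `dev` and `verb`, and the .txt suffix (taken from the question) are assumptions for this example.

```shell
# Recreate both sample inputs in a scratch directory.
cd "$(mktemp -d)" || exit 1
awk 'BEGIN { for (i = 1; i <= 15; i++) printf "FILE%02d\n", i }' > FILE.txt
printf '%s\n' 'vgaaa 2' 'vgbbb 5' 'vgccc 1' 'vgddd 7' > SPLIT_BY.txt

awk -v devices=FILE.txt '
{
  out = "FILE" NR ".txt"                 # one output file per SPLIT_BY.txt line
  for (i = 1; i <= $2; i++) {
    if ((getline dev < devices) <= 0)    # stop if FILE.txt runs out of lines
      break
    verb = (i == 1) ? "vgcreate" : "vgextend"
    print verb, $1, dev > out
  }
  close(out)                             # at most one output file open at a time
}' SPLIT_BY.txt
```

Because no lookup tables are built, memory use stays constant however long FILE.txt is. The trade-off is that the logic depends on getline reading FILE.txt sequentially across records.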

# 5  
Old 10-12-2011
What a skill, thank you!