Frustrating in splitting text files


 
# 1  
Old 03-31-2018
Frustrating in splitting text files

Moderator's Comments:
Mod Comment Duplicate threads merged


Dear all,

I have been processing a very large text file by hand, and I'm wondering how to do this with a script. The task should be straightforward:

I just want to split the text into multiple files. The file names should be "CP1", "TS1 for the second step", and "PR1 for the product"; each name appears in the text as a single word or a sentence without a clear pattern. However, the coordinate lines are in a strict format: an element symbol starting with a capital letter, followed by three floating-point numbers, giving exactly four columns in total. Any suggestion or comment would be greatly appreciated! Thanks!!

Zhen
Code:
CP1
Ir -0.77842700 -2.00617700 -0.99311000
C 0.10419200 -3.27273900 -0.26241700
N 0.20297200 -2.01968800 -2.23710000
H -1.68935400 -0.88975000 -1.81955400
P 0.43760100 -0.44588500 -0.07461300
TS1 for the second step
Ir -0.77842700 -2.00617700 -0.99311000
H 0.10419200 -3.27273900 -0.26241700
O 0.20297200 -2.01968800 -2.23710000
H -1.68935400 -0.88975000 -1.81955400
P 0.43760100 -0.44588500 -0.07461300
PR1 for the product
Ir -0.77842700 -2.00617700 -0.99311000
H 0.10419200 -3.27273900 -0.26241700
S 0.20297200 -2.01968800 -2.23710000
H -1.68935400 -0.88975000 -1.81955400
P 0.43760100 -0.44588500 -0.07461300

The output should be three separate files:

cat "CP1"
Code:
Ir -0.77842700 -2.00617700 -0.99311000
C 0.10419200 -3.27273900 -0.26241700
N 0.20297200 -2.01968800 -2.23710000
H -1.68935400 -0.88975000 -1.81955400
P 0.43760100 -0.44588500 -0.07461300

cat "TS1 for the second step"
Code:
Ir -0.77842700 -2.00617700 -0.99311000
H 0.10419200 -3.27273900 -0.26241700
O 0.20297200 -2.01968800 -2.23710000
H -1.68935400 -0.88975000 -1.81955400
P 0.43760100 -0.44588500 -0.07461300

cat "PR1 for the product"
Code:
Ir -0.77842700 -2.00617700 -0.99311000
H 0.10419200 -3.27273900 -0.26241700
S 0.20297200 -2.01968800 -2.23710000
H -1.68935400 -0.88975000 -1.81955400
P 0.43760100 -0.44588500 -0.07461300


Last edited by jim mcnamara; 03-31-2018 at 11:39 PM..
# 3  
Old 04-01-2018
Given your entire large text file adheres to the structure of the sample posted, how far would this get you:
Code:
awk '!/[0-9]+\.[0-9]+/ {FN = $0; next} {print > FN}' file
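To see what the one-liner does, it can be tried on a shortened copy of the sample in a scratch directory. This is only a sketch: the scratch directory and the two-block excerpt are for illustration, not part of the original data.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Shortened copy of the sample from the first post (two blocks only).
cat > file <<'EOF'
CP1
Ir -0.77842700 -2.00617700 -0.99311000
C 0.10419200 -3.27273900 -0.26241700
TS1 for the second step
Ir -0.77842700 -2.00617700 -0.99311000
H 0.10419200 -3.27273900 -0.26241700
EOF

# A line without a decimal number becomes the next output file name;
# every other line is written to the current file.
awk '!/[0-9]+\.[0-9]+/ {FN = $0; next} {print > FN}' file

# List the results and show one of the generated files.
ls
cat "TS1 for the second step"
```

Note that a blank line in the input would also count as a "name" line and set FN to an empty string, so this relies on the input having no blank lines, as in the sample.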

# 4  
Old 04-02-2018
Depending on how many files you need to create, RudiC's solution above may run out of open file handles, in which case you may need to modify it to close the previous file:

Code:
awk '!/[0-9]+\.[0-9]+/ {if (length(FN)) close(FN); FN = $0; next} {print > FN}' file
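If a header line might itself contain a decimal number, or the file has blank lines, a stricter test based on the four-column format described in the first post may be safer. The following is a sketch under those assumptions and has not been run on the real data: the header "TS1 version 2.0" is a made-up example, and `is_coord` and `out` are names chosen here.

```shell
# Work in a scratch directory with a small, partly hypothetical sample.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > file <<'EOF'
CP1
Ir -0.77842700 -2.00617700 -0.99311000
C 0.10419200 -3.27273900 -0.26241700

TS1 version 2.0
Ir -0.77842700 -2.00617700 -0.99311000
H 0.10419200 -3.27273900 -0.26241700
EOF

awk '
# A coordinate line has exactly four fields: an element symbol
# (capital letter, optional lowercase letter) and three decimal numbers.
function is_coord(   i) {
    if (NF != 4 || $1 !~ /^[A-Z][a-z]?$/) return 0
    for (i = 2; i <= 4; i++)
        if ($i !~ /^[-+]?[0-9]+\.[0-9]+$/) return 0
    return 1
}
NF == 0 { next }               # skip blank lines
!is_coord() {                  # any other line names the next output file
    if (out != "") close(out)  # release the previous file handle
    out = $0
    next
}
out != "" { print > out }      # coordinates before the first header are dropped
' file

ls
```

As with the versions above, a header that repeats later in the input would overwrite the earlier file of the same name, and a header containing "/" would not be usable as a file name.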
