Split file at location of textpattern


# 1  

I have a file that I want to split in two, preferably with the Bourne shell (sh). It is a configuration file for several elements, so it consists of a repeated configuration pattern like this:

config.txt:
#fruit banana
#color yellow
#surface smooth
size 20cm

#fruit apple
#color green
#surface smooth
size 7cm

#fruit grape
#color green
#surface smooth
size 2cm

I want to split the file into two pieces that are as equal as possible, but a split must fall at the start of an element (a line beginning with #fruit). If the configuration file has an odd number of entries, one of the files should get one more item than the other; otherwise the two resulting files should contain the same number of items.

Tags like "#fruit" are unique, so they can be used with e.g. "grep" combined with "wc -l" to find the number of items and the element at which to split.
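
A minimal sketch of that counting step, assuming the sample above is saved as config.txt in the current directory (the block recreates it). It uses "grep -c" in place of "grep | wc -l", which gives the same count without the padding some "wc" versions print. The variable names are only illustrative.

```shell
# Recreate the sample file from the post (three entries separated by blank lines).
cat > config.txt <<'EOF'
#fruit banana
#color yellow
#surface smooth
size 20cm

#fruit apple
#color green
#surface smooth
size 7cm

#fruit grape
#color green
#surface smooth
size 2cm
EOF

# One "#fruit" line per entry, so counting those lines counts the entries.
count=`grep -c '^#fruit' config.txt`

# Entries for the first file: half, rounded up, so an odd count
# puts the extra entry in the first file.
first=`expr \( "$count" + 1 \) / 2`

echo "entries: $count, first file gets: $first"
```

Rounding up is an arbitrary choice here; rounding down would put the extra entry in the second file instead.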

Is this a typical awk job?

Borgeh
# 2  
Hi.

How is this different from https://www.unix.com/shell-programmin...#post302137399
and what have you done so far to solve it? ... cheers, drl
# 3  
Quote:
Originally Posted by drl
Hi.

How is this different from https://www.unix.com/shell-programmin...#post302137399
and what have you done so far to solve it? ... cheers, drl
It's not different, but since I received no answer to that query I decided to describe the problem in a different way, as maybe it was difficult to understand what I meant. I have a feeling that this should be an easy task, but I am stuck. I have determined the element at which I should split: I grep for "#fruit" to get the number of elements, then use "expr" with "/" (integer division) to get half of that number. From there I am unsure about the rest. I suspect awk is the way to go, but I am not sure how. Another option is to find the line number of the start of the element where I should cut.
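
The line-number option mentioned last can be sketched roughly as follows. This is one possible approach, not a tested solution from the thread: "grep -n" reports the line number of every "#fruit" line, and "sed" then copies the lines before and from that point. The output names part1.txt and part2.txt are hypothetical, and the sketch assumes the file holds at least two entries.

```shell
# Recreate the sample file from the first post.
cat > config.txt <<'EOF'
#fruit banana
#color yellow
#surface smooth
size 20cm

#fruit apple
#color green
#surface smooth
size 7cm

#fruit grape
#color green
#surface smooth
size 2cm
EOF

count=`grep -c '^#fruit' config.txt`       # number of entries
first=`expr \( "$count" + 1 \) / 2`        # entries kept in the first file
next=`expr "$first" + 1`                   # index of the first entry of the second file

# grep -n prints "LINE:text"; take the line number of entry number $next.
line=`grep -n '^#fruit' config.txt | sed -n "${next}p" | cut -d: -f1`
last=`expr "$line" - 1`

sed -n "1,${last}p" config.txt > part1.txt   # everything before the cut
sed -n "${line},\$p" config.txt > part2.txt  # from the cut to the end of the file
```

For the sample file the cut lands on line 11, the "#fruit grape" line.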
# 4  
Grep for #fruit and count the matches with wc -l. Your source has 5 lines per entry (4 lines of settings plus the blank separator), so divide the number of #fruit entries by 2 and multiply the result by 5 using bc. You can then pass that number to split -l to produce your two files. I would also add a check to make sure none of the lines go missing.
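
A rough sketch of this suggestion, under the assumption that every entry really occupies exactly 5 lines (4 settings plus one blank separator). It uses "expr" rather than "bc" and rounds the half up; the prefix "part_" is a hypothetical name, and "split" appends "aa" and "ab" to it.

```shell
# Recreate the sample file from the first post.
cat > config.txt <<'EOF'
#fruit banana
#color yellow
#surface smooth
size 20cm

#fruit apple
#color green
#surface smooth
size 7cm

#fruit grape
#color green
#surface smooth
size 2cm
EOF

count=`grep -c '^#fruit' config.txt`       # number of entries
first=`expr \( "$count" + 1 \) / 2`        # entries for the first file
lines=`expr "$first" \* 5`                 # 5 lines per entry, blank separator included

# Write part_aa (the first $lines lines) and part_ab (the rest).
split -l "$lines" config.txt part_

# Check that no lines went missing: the pieces must add up to the original.
if cat part_aa part_ab | cmp -s - config.txt; then
    echo "split ok"
else
    echo "split lost or changed lines" >&2
fi
```

If an entry ever has a different number of lines, or the file starts with blank lines, the fixed factor of 5 puts the cut in the wrong place.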
# 5  
Quote:
Originally Posted by tomas
Grep for #fruit and count the matches with wc -l. Your source has 5 lines per entry (4 lines of settings plus the blank separator), so divide the number of #fruit entries by 2 and multiply the result by 5 using bc. You can then pass that number to split -l to produce your two files. I would also add a check to make sure none of the lines go missing.
Thanks!
I think this is close to a way to do it. How can the x5 help me find the correct place to cut?
Something like this might work:

- Filter out leading or trailing blank lines to ensure the count is correct.
- Grep for "#fruit" and pipe it through "wc -l" to get the number of "#fruit" elements.
- If the number is even, I can split in the middle:

"wc -l"/2.

- If the number is odd, I can split at:

("wc -l"/2)+3 lines

And then I probably have to adjust by one line up or down to get the split exact.
Hmm... if this works, I need to find out whether a number is odd or even.
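
For the odd/even question, one possible sketch uses the remainder operator of "expr". The function name parity is made up for illustration, and the counts passed to it are example values.

```shell
# Print "even" or "odd" for the number given as the first argument.
parity() {
    # expr "$1" % 2 prints the remainder of the division by 2 (0 or 1).
    if [ `expr "$1" % 2` -eq 0 ]; then
        echo even
    else
        echo odd
    fi
}

parity 3    # prints "odd"
parity 4    # prints "even"
```

The test may not even be needed: computing (count + 1) / 2 with integer division gives the size of the larger half for odd counts and exactly half for even counts.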

Borgeh
# 6  
Hi.

An awk script may be useful. There is a special variable "RS", the Record Separator, that may be set to the empty string to read "paragraphs", i.e. groups of lines separated by an empty line:
Code:
RS = ""

That would allow you to treat your file as essentially just a number of such records.

With your calculated knowledge of where you want the split to be, the "pattern" part of an awk statement:
Code:
  pattern { action }

should allow you to complete the solution with the help of another built-in variable, "NR", the number of the current record. This works because the pattern part may be a logical expression, such as:
Code:
NR <= 5 { some-action-for-this-case }

The action might be something as simple as print ... cheers, drl
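
Putting those pieces together, a sketch along these lines might do the job. It assumes an awk that accepts the -v option (POSIX awk, nawk, gawk); the output names part1.txt and part2.txt and the variable first are hypothetical.

```shell
# Recreate the sample file from the first post.
cat > config.txt <<'EOF'
#fruit banana
#color yellow
#surface smooth
size 20cm

#fruit apple
#color green
#surface smooth
size 7cm

#fruit grape
#color green
#surface smooth
size 2cm
EOF

count=`grep -c '^#fruit' config.txt`       # number of entries
first=`expr \( "$count" + 1 \) / 2`        # entries for the first file

awk -v first="$first" '
BEGIN { RS = ""; ORS = "\n\n" }            # read paragraphs, write them back with a blank line
{
    # Records 1..first go to the first file, the rest to the second.
    out = (NR <= first) ? "part1.txt" : "part2.txt"
    print > out
}
' config.txt
```

Because each whole paragraph is one record, this does not depend on every entry having the same number of lines. Each output file ends with a blank line, since ORS is written after every record.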
# 7  
Code:
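count=`grep -c '^#fruit' config.txt`        # number of entries
next=`expr \( "$count" + 1 \) / 2 + 1`      # first entry of the second file
line=`grep -n '^#fruit' config.txt | sed -n "${next}p" | cut -d: -f1`
csplit -f fruit config.txt "$line"          # writes fruit00 and fruit01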
