Non trivial file splitting, saving with variable filename
Post 302821725 by samask on Saturday 15th of June 2013 11:00:23 AM (Shell Programming and Scripting)

Hello,

Although I have found similar questions, none of the advice in them solves our problem.

The issue:

We have a few thousand text files (books).

Each book has many chapters, and each chapter is identified by a cite-key. We
need to split each book file by chapter, using each chapter's cite-key as the
file name.

Example of book file:

Code:
* Chapter 1 -- Branchial or Visceral Arches

  :PROPERTIES:
  :GENRE: biology
  :CITE-KEY: DW:1
  :END:


The Branchial or Visceral Arches and Pharyngeal Pouches. -- In
the lateral walls of the anterior part of the fore-gut five pharyngeal
pouches appear (Fig. 42).



* Chapter 2 -- Dorsal and Ventral Diverticulum

  :PROPERTIES:
  :GENRE: biology
  :CITE-KEY: DW:2
  :END:


Each of the upper four pouches is prolonged into a dorsal and a ventral
diverticulum.

Over these pouches corresponding indentations of the ectoderm occur, forming 
what are known as the branchial or outer pharyngeal grooves.


[etc.]

After splitting, we would have a series of files in the same directory as the source:
dw-1.txt, dw-2.txt, etc., each containing only its own chapter.

For example, the file dw-2.txt would contain:

Code:
* Chapter 2

  :PROPERTIES:
  :GENRE: biology
  :CITE-KEY: DW:2
  :END:


Each of the upper four pouches is prolonged into a dorsal and
a ventral diverticulum.

Over these pouches corresponding indentations of the ectoderm occur,
forming what are known as the branchial or outer pharyngeal grooves.

As you may notice, these files use Org-mode syntax. We can already split them
by mapping a function with Emacs's (org-map-entries), but that is far too slow:
the text files change, and we need to re-split all the books frequently.


Could anybody give me a hint on how to do this with awk or some other fast
shell tool?


Thank you very much.
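
One possible approach, offered as a minimal sketch and not as a solution tested against the real corpus: since the cite-key only appears a few lines after the chapter heading, awk can buffer the lines from each top-level heading (`* ...`) until it reaches the `:CITE-KEY:` line, then open the output file and stream the rest of the chapter into it. The function name `split_chapters` is hypothetical, and the file-name rule (lowercase the key, replace every run of non-alphanumeric characters with a hyphen) is an assumption inferred from the dw-1.txt example. The heading line is copied unchanged.

```shell
# Hypothetical helper: split each Org-style book file into one file per chapter.
split_chapters() {
    for book in "$@"; do
        # Chapter files are written next to the source book.
        outdir=$(dirname -- "$book")

        awk -v outdir="$outdir" '
            # A top-level heading starts a new chapter: close the previous
            # output file and buffer lines until the cite-key is known.
            /^\* / {
                if (out != "") close(out)
                out = ""
                buf = ""
                in_chapter = 1
            }

            # Before the cite-key line, hold the chapter lines in memory.
            in_chapter && out == "" {
                buf = buf $0 "\n"
                if ($1 == ":CITE-KEY:") {
                    # Assumed mapping: DW:2 -> dw-2.txt
                    name = tolower($2)
                    gsub(/[^a-z0-9]+/, "-", name)
                    out = outdir "/" name ".txt"
                    printf "%s", buf > out   # ">" replaces any older copy
                    buf = ""
                }
                next
            }

            # After the cite-key, write lines straight to the chapter file.
            out != "" { print > out }
        ' "$book"
    done
}
```

With the function defined, something like `split_chapters /path/to/books/*.txt` would process every book in one pass. Text before the first heading is ignored, and a chapter with no `:CITE-KEY:` line is silently dropped, so either case would need extra handling if it occurs in the real files. Closing each chapter file before opening the next keeps awk from running out of file descriptors on long books.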
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.