Save files in directory as txt (Shell Programming and Scripting)
Post 302928539 by junior-helper on Friday 12th of December 2014 12:14:13 PM
You do know that you have just renamed an HTML file to a txt file, don't you?
An HTML file can hardly be converted to a txt file in a way that makes the HTML tags disappear (yes, you can parse it with sed, but that's not recommended).

What do you think about this? Yes, it looks complicated, but it might be a much better solution.

In the download folder:
(Make sure it contains only the files downloaded from link.txt, just in case.)

Code:
awk '/pdf/ {
    gsub(/^.*href = "|".*/,"",$0)
    print FILENAME,$0 >> "/tmp/tcode-pdf.txt"
    print "http://geneticslab.emory.edu/tests/"$0 >> "/tmp/list2.txt"
}' *

The above awk command
  • extracts the "path" to the download link of the corresponding PDF file,
  • creates a file /tmp/tcode-pdf.txt with testcode-pdfname pairs (used later in the renaming process),
  • generates a download list, /tmp/list2.txt.
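To see what the extraction produces before running it on real downloads, here is a small sketch that runs the same awk on one made-up page in a scratch directory. The file name MM123 and the sample link line are assumptions for illustration (the line is written to match the `href = "` pattern the awk expects), and the output files go to the scratch directory instead of /tmp.

```shell
# Work in a scratch directory so the real download folder is not touched.
demo=$(mktemp -d)
mkdir "$demo/pages"

# Hypothetical downloaded page: the file name is the test code,
# and one line holds the link to the PDF.
printf '%s\n' '<p>Intro text</p>' \
    '<a href = "test-pdf.php?testid=4125">Requisition pdf</a>' > "$demo/pages/MM123"

# Same awk as above, except that the two output files live in $demo.
( cd "$demo/pages" && awk -v out="$demo" '/pdf/ {
    gsub(/^.*href = "|".*/, "", $0)
    print FILENAME, $0 >> (out "/tcode-pdf.txt")
    print "http://geneticslab.emory.edu/tests/" $0 >> (out "/list2.txt")
}' * )

cat "$demo/tcode-pdf.txt"   # MM123 test-pdf.php?testid=4125
cat "$demo/list2.txt"       # http://geneticslab.emory.edu/tests/test-pdf.php?testid=4125
```

If the real pages write the link differently (for example `href="` without the spaces), the pattern in gsub would need to be adjusted accordingly.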
Code:
wget -x -i /tmp/list2.txt

This time, wget will download the PDFs.

Code:
awk '{ A[$1]=$2; next} END { for (i in A) print "mv \x27"A[i]"\x27",i".pdf" }' /tmp/tcode-pdf.txt | sh

This awk command generates the commands that rename each PDF from its cryptic filename to testcode.pdf, and pipes them to sh for execution.
E.g. test-pdf.php?testid=4125 becomes MM123.pdf
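Since the generated commands are executed right away, it may be worth doing a dry run first. The sketch below prints the mv commands without piping them to sh. The pair file is a made-up one-line sample in the same format as /tmp/tcode-pdf.txt, and `\047` (the octal spelling of the single quote) is used in place of `\x27` on the assumption that it is accepted by more awk variants.

```shell
demo=$(mktemp -d)

# Hypothetical pair file: "testcode pdfname", one pair per line.
printf '%s\n' 'MM123 test-pdf.php?testid=4125' > "$demo/tcode-pdf.txt"

# Same logic as above, but the commands are captured and printed, not executed.
cmds=$(awk '{ A[$1]=$2; next } END { for (i in A) print "mv \047" A[i] "\047", i ".pdf" }' "$demo/tcode-pdf.txt")

printf '%s\n' "$cmds"   # mv 'test-pdf.php?testid=4125' MM123.pdf
```

Once the printed commands look right, appending `| sh` to the awk command runs them.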

Code:
for i in *.pdf; do
 pdftotext "$i"
done

This converts each PDF to a txt file with the same base name (MM123.pdf becomes MM123.txt).
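A slightly more defensive variant of the loop, as a sketch: it passes the output name explicitly, skips the case where no PDF matches, and reports files that pdftotext could not convert. The function names txt_name and convert_all are made up for this example.

```shell
# Derive the .txt name for a given PDF name (plain string handling).
txt_name() {
    printf '%s\n' "${1%.pdf}.txt"
}

# Convert every PDF in the current directory, reporting failures on stderr.
convert_all() {
    for i in *.pdf; do
        [ -e "$i" ] || continue          # no PDFs here: the glob stayed literal
        pdftotext "$i" "$(txt_name "$i")" || echo "failed: $i" >&2
    done
}
```

Calling `convert_all` in the download folder does the same job as the loop above.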

I've experimented with one test code and the output looks very viable :)

Last edited by junior-helper; 12-12-2014 at 06:33 PM.. Reason: substituted "pdftotext *.pdf" with a for loop
 
