Alignment tool to join text files in 2 directories to create a parallel corpus
I have two directories called English and Hindi. Each directory contains the same number of files, the only difference being that in the English directory the tag is
and in the Hindi one the tag is
The file may contain either a single text or more than one text as in the example below.
in the English directory contains 22 lines of which the first four are provided
The same number of lines and in the same order are provided in
in the Hindi directory. The first four are provided by way of sample
In some cases a given file may contain only one line.
What I need is to join the English lines to the corresponding Hindi lines with
as a delimiter
An example of the output of the four lines given above is shown below
Since there are too many files in each directory, manipulating them manually is impractical. I need an alignment tool that will do the job.
A Perl or awk script would be of great help. I do not know how to handle directories in Perl or awk, hence the request.
I work in a Windows environment.
Many thanks for your help.
If each pair of corresponding files has the same number of lines, paste can join them line by line: it creates one record containing line n from each file, separated by the delimiter of your choice (the default is a tab).
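For a Windows setup where paste is not at hand, the same line-by-line join can be sketched in Python. This is only a minimal sketch under stated assumptions, not a finished tool: the function names `join_lines` and `align_directories` are made up for illustration, the files are assumed to be UTF-8, the delimiter defaults to a tab as with paste, and corresponding files are assumed to land at the same position once the file names in each directory are sorted. If the naming scheme does not sort that way, the pairing step would need to be replaced.

```python
from pathlib import Path


def join_lines(english_lines, hindi_lines, delimiter="\t"):
    """Pair line n of one text with line n of the other, as paste does."""
    if len(english_lines) != len(hindi_lines):
        raise ValueError("the two files have different numbers of lines")
    return [f"{en}{delimiter}{hi}" for en, hi in zip(english_lines, hindi_lines)]


def align_directories(english_dir, hindi_dir, out_dir, delimiter="\t"):
    """Join every file pair and write one output file per pair.

    Assumption: sorting the file names in each directory puts
    corresponding files at the same position in both lists.
    Assumption: all files are UTF-8 encoded.
    """
    english_files = sorted(p for p in Path(english_dir).iterdir() if p.is_file())
    hindi_files = sorted(p for p in Path(hindi_dir).iterdir() if p.is_file())
    if len(english_files) != len(hindi_files):
        raise ValueError("the two directories hold different numbers of files")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    for en_path, hi_path in zip(english_files, hindi_files):
        en = en_path.read_text(encoding="utf-8").splitlines()
        hi = hi_path.read_text(encoding="utf-8").splitlines()
        joined = join_lines(en, hi, delimiter)
        # The output file reuses the English file name (an arbitrary choice).
        target = out / en_path.name
        target.write_text("".join(line + "\n" for line in joined), encoding="utf-8")

    # Number of file pairs that were joined.
    return len(english_files)
```

Calling `align_directories("English", "Hindi", "Parallel", delimiter="\t")` would write one joined file per pair into a directory named `Parallel` (a hypothetical name), and raise an error as soon as a pair of files or the two directories do not match in size.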
Hi Guys,
I want to combine 2 files and put them together in 1 file. See the desired output below. Any help will be much appreciated.
FILE A
X 2134 101L 12345.00 22222.00 1 10
X 2134 101L 12345.00 22222.00 11 20
X 2134 101L 12345.00 22222.00 21 30
X 2134 111L 77777.00 ... (3 Replies)
Gents,
Can you please help?
I want to create a list which contains the complete path of the location of some directories, with the size of each file.
I need to select only .txt files.
In this case I am trying to find the subdirectories tp1 and tp2 and create the output list.
jd175-1
tp1... (3 Replies)
Can anyone please help me? I have 2 text files set up like the ones below.
Textfile1:
randomemail1:randompassword1
randomemail2:randompassword2
randomemail3:randompassword3
randomemail4:randompassword4
randomemail5:randompassword5
Textfile2:
randompassword1:randomphrase1... (8 Replies)
Hello guys,
I've got a big corpus (a huge text file in which words are separated by one or several spaces). I would like to know if there is a simple way - using awk for instance - to extract any co-occurrence appearing at least 3 times through the whole corpus for a given word. By co-occurrence,... (7 Replies)
Hello everyone,
I work under Ubuntu 11.10 (c-shell)
I need a script to create a new text file whose content is the text of the other text files that are in the directory $DIRMAIL at this moment.
I will show you an example:
- On the one hand, there is a directory $DIRMAIL where there are... (1 Reply)
Hi,
I have several files containing experiment measurements per hour (hour_1.txt has measurements for the first hour, etc.). I have 720 of these files (i.e. up to hour_720.txt) and I want to create 720 directories and in every one of them I want to copy its associated file (e.g.... (4 Replies)
I need a simple command line executable that allows me to join many wmv files into one output wmv file, preferably in a simple way like this:
wmvjoin file1.wmv file2.wmv .... > outputfile.wmv
So what I want is the wmv equivalent of mpgtx.
I cannot find it on the internet.
Thanks. (2 Replies)