Combine two text files


 
# 8  
Old 09-15-2015
Yes, exactly.

---------- Post updated 09-15-15 at 05:31 AM ---------- Previous update was 09-14-15 at 07:25 PM ----------

The number of lines will be the same.

Last edited by my_Perl; 09-15-2015 at 07:30 AM..
# 9  
Old 09-15-2015
Here is my generic XML script which I use for such things...

Code:
$ cat xml.awk

BEGIN {
        FS=">"; OFS=">";
        RS="<"; ORS="<"
}

# These should be special variables for match() but aren't.
# NOTE: N below is a global, not a parameter, so code that uses this file must not assign to N.
function rbefore(STR)   { return(substr(STR, N, RSTART-1)); }# before match
function rmid(STR)      { return(substr(STR, RSTART, 1)); }  # First char match
function rall(STR)      { return(substr(STR, RSTART, RLENGTH)); }# Entire match
function rafter(STR)    { return(substr(STR, RSTART+RLENGTH)); }# after match

function aquote(OUT, A, PFIX, TA) { # Turns Q SUBSEP R into A[PFIX":"Q]=R
        if(OUT)
        {
                if(PFIX) PFIX=PFIX":"
                split(OUT, TA, SUBSEP);
                A[toupper(PFIX) toupper(TA[1])]=TA[2];
        }

        return("");
}

# Intended to be less stupid about quoted text in XML/HTML.
# Splits a='b' c='d' e='f' into A[PFIX":"a]=b, A[PFIX":"c]=d, etc.
function qsplit(STR, A, PFIX, X, OUT) {
        while(STR && match(STR, /([ \n\t]+)|[\x27\x22=]/))
        {
                OUT = OUT rbefore(STR);
                RMID=rmid(STR);

                if((RMID == "'") || (RMID == "\""))     # Quote characters
                {
                        if(!Q)          Q=RMID;         # Begin quote section
                        else if(Q == RMID)      Q="";   # End quote section
                        else                    OUT = OUT RMID; # Quoted quote
                } else if(RMID == "=") {
                        if(Q)   OUT=OUT RMID; else OUT=OUT SUBSEP;
                } else if((RMID=="\r")||(RMID=="\n")||(RMID=="\t")||(RMID==" ")) {
                        if(Q)   OUT = OUT rall(STR); # Literal quoted whitespace
                        else    OUT = aquote(OUT, A, PFIX); # Unquoted WS, next block
                }
                STR=rafter(STR); # Strip off the text we've processed already.
        }

        aquote(OUT STR, A, PFIX); # Process any text we haven't already.
}


{ SPEC=0 ; TAG="" }

NR==1 {
        if(ORS == RS) print;
        next } # The first "line" is blank when RS=<

/^[!?]/ { SPEC=1    }   # XML specification junk

# Handle open-tags
match($1, /^[^\/ \r\n\t>]+/) {
        TAG=substr(toupper($1), RSTART, RLENGTH);
        if((!SPEC) && !($1 ~ /\/$/))
        {
                TAGS=TAG "%" TAGS;
                DEP++;
                LTAGS=TAGS
        }

        for(X in ARGS) delete ARGS[X];

        qsplit(rafter($1), ARGS);
}

# Handle close-tags
(!SPEC) && /^[\/]/ {
        sub(/^\//, "", $1);
        LTAGS=TAGS
#        sub("^.*" toupper($1) "%", "", TAGS);
        sub("^" toupper($1) "%", "", TAGS);
        $1="/"$1
        DEP=split(TAGS, TA, "%")-1;
        if(DEP < 0) DEP=0;
}

$
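To see the record splitting that the BEGIN block relies on, here is a minimal sketch in isolation. It is not part of xml.awk; the helper name split_tags and the sample input are made up for illustration. With RS="<" every record starts at a tag, and with FS=">" the tag lands in $1 and the character data that follows it in $2.

```shell
# Hypothetical helper: print each "<"-separated record as tag + character data.
split_tags() {
    awk 'BEGIN { RS = "<"; FS = ">" }
         # Skip the empty record that precedes the first "<".
         length($0) { printf "tag=[%s] cdata=[%s]\n", $1, $2 }'
}

printf '%s' '<a x="1">hello</a>' | split_tags
```

The opening tag arrives with its attributes still attached in $1, which is why xml.awk goes on to split them with qsplit().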

It grabs and processes tags and cdata one by one into variables. In this case I use it like:

Code:
$ awk -f xml.awk -e '(TAGS ~ /^TEXT/) {
        printf("{{Text_ID=%s}}{{From=%s}}\n",ARGS["TEXT_ID"],ARGS["FROM"]);
        N=gsub(/\n/, "")+1;
        for(M=1; M<=N; M++) { RS="\n"; getline < "altinput.txt" ; RS="<" ; print } ; print "<ENDTEXT>" }' ORS="\n" OFS=" " text.xml

{{Text_ID=10155645315851111_10155645333076543}}{{From=460350337461111}}
This is the first text
<ENDTEXT>
{{Text_ID=10155645315851111_10155645317023456}}{{From=1626711840902323}}
This is the second text
<ENDTEXT>
{{Text_ID=10155645315851111_10155645320006543}}{{From=1481727095384343}}
This is the third text
If counted
GOT IT... ����
<ENDTEXT>
{{Text_ID=}}{{From=}}
This is the fourth text ........ This is different.
<ENDTEXT>

$
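The getline part of that command can be tried on its own. The following is only a sketch under simplifying assumptions: both inputs are plain newline-separated files, and the helper name pair_lines, the variable alt, and the sample data are all made up. It reads one line from a second file for every record of the first.

```shell
# Hypothetical helper: print each line of file $1 followed by the matching line of file $2.
pair_lines() {
    awk -v alt="$2" '{
        print "{{" $0 "}}"
        # getline into a variable leaves $0 and NF untouched.
        if ((getline line < alt) > 0) print line
        print "<ENDTEXT>"
    }' "$1"
}

tmp=$(mktemp -d)
printf '%s\n' 'Text_ID=1' 'Text_ID=2' > "$tmp/first.txt"
printf '%s\n' 'This is the first text' 'This is the second text' > "$tmp/second.txt"
pair_lines "$tmp/first.txt" "$tmp/second.txt"
rm -rf "$tmp"
```

The command in the post uses a bare getline < "altinput.txt" instead, which overwrites $0, and switches RS to "\n" around the call, presumably so that altinput.txt is read line by line while the XML is still read tag by tag.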

# 10  
Old 09-15-2015
Thanks a lot for the beautiful work.

But I don't understand why I am getting

Code:
{{Text_ID=}}{{From=}}
This is the fourth text ........ This is different.
<ENDTEXT>

instead of

Code:
{{Text_ID=10155645315851111_10155645326222345}}{{From=411021195696789}}
This is the fourth text ........ This is different.
<ENDTEXT>


Last edited by my_Perl; 09-15-2015 at 09:12 PM.. Reason: Editing
# 11  
Old 09-15-2015
I apologize for the bug; I had a variable naming conflict (the N in my one-liner clashed with the global N used inside xml.awk).

Code:
$ awk -f xml.awk -e '(TAGS ~ /^TEXT/) {
        printf("{{Text_ID=%s}}{{From=%s}}\n",ARGS["TEXT_ID"],ARGS["FROM"]);
        NN=$0 ; NN=gsub(/\n/, "",NN)+1;
        for(MM=1; MM<=NN; MM++) { RS="\n"; getline < "altinput.txt" ; RS="<" ; print } ; print "<ENDTEXT>" }' ORS="\n" OFS=" " text.xml
{{Text_ID=10155645315851111_10155645333076543}}{{From=460350337461111}}
This is the first text
<ENDTEXT>
{{Text_ID=10155645315851111_10155645317023456}}{{From=1626711840902323}}
This is the second text
<ENDTEXT>
{{Text_ID=10155645315851111_10155645320006543}}{{From=1481727095384343}}
This is the third text
If counted
GOT IT... ����
<ENDTEXT>
{{Text_ID=10155645315851111_10155645326222345}}{{From=411021195696789}}
This is the fourth text ........ This is different.
<ENDTEXT>

$
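A minimal sketch of this kind of conflict, assuming the culprit is the global N that rbefore() reads in xml.awk. The function names head_shared and head_fixed and the wrapper show_conflict are made up for illustration. In awk, any variable that is not in a function's parameter list is global, so an assignment elsewhere in the program changes what the function sees.

```shell
# Hypothetical demo: a function that reads a global versus one that does not.
show_conflict() {
    awk '
    # N is not in the parameter list, so it is a global shared with every rule.
    function head_shared(s, len) { return substr(s, N, len) }
    # The start position is fixed, so no outside assignment can move it.
    function head_fixed(s, len)  { return substr(s, 1, len) }
    BEGIN {
        N = 1; print head_shared("text_id", 4), head_fixed("text_id", 4)
        N = 3; print head_shared("text_id", 4), head_fixed("text_id", 4)
    }'
}

show_conflict
```

The first line prints "text text"; after N is set to 3 the second prints "xt_i text". That matches the symptom in the thread: the third text has three lines, so the old command left N=3 behind and the next tag's attribute names were cut off at the front.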


Last edited by Corona688; 09-16-2015 at 03:34 PM..
# 12  
Old 09-15-2015
I sincerely appreciate it.
# 13  
Old 11-01-2015
I need help understanding how to produce OUTPUT.txt using only the FIRST.txt file, without SECOND.txt. What changes should I make?

Thanks in advance.
# 14  
Old 11-02-2015
Not sure I understand Corona688's proposal in its entirety, but you could try replacing RS="\n"; getline < "altinput.txt" ; RS="<" with gsub (/<[^>]*>/, _). NO warranty!
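What that gsub does can be checked in isolation. This sketch runs it on an ordinary newline-separated line rather than inside xml.awk, so it says nothing about how it interacts with RS="<" there. The helper name strip_tags and the sample line are made up, and "" is written out where the suggestion uses the unset variable _ (which also evaluates to an empty string).

```shell
# Hypothetical helper: delete everything that looks like <...> from each line.
strip_tags() {
    awk '{ gsub(/<[^>]*>/, ""); print }'
}

printf '%s\n' '<text text_id="1" from="2">This is the first text</text>' | strip_tags
```

This prints "This is the first text". The pattern removes anything between a "<" and the next ">", so it is only safe when the text itself contains no literal "<" characters.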