Compare two text file and output the same to third file


 
# 1  
Old 02-24-2012

Need help.

I have a source file that lists sets of numbers/words. For each line of the source file, I want to look for the same numbers/words in a second file and, if there is a match, write it to a third file. The third file should contain the whole line from the source file plus the whole matching line from the second file, separated by a space or tab.

Any idea how to do this in Perl?

ex:

Source file
Code:
123-23456-2341
783-ueu72-k812

second file:
Code:
linenumber1-sometext:123-23456-2341:another text
linenumber2--sometext:637-7288-ju18:some other text
linenumber3--sometext:783-ueu72-k812:some more text


Expected result:
Code:
123-23456-2341 linenumber1-sometext:123-23456-2341:another text
783-ueu72-k812 linenumber3--sometext:783-ueu72-k812:some more text

Last edited by Franklin52; 02-26-2012 at 05:11 AM.. Reason: Please use code tags for data and code samples, thank you
# 2  
Old 02-24-2012
Code:
$ grep -f file1.txt file2.txt
linenumber1-sometext:123-23456-2341:another text
linenumber3--sometext:783-ueu72-k812:some more text

On Solaris, use the grep below:

/usr/xpg4/bin/grep
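
A small variation on the command above, as a sketch only: by default grep treats each line of the pattern file as a regular expression, so adding -F makes it match the keys as literal strings. The file names and the output file matches.txt are placeholders; the script recreates the sample data from post #1 in the current directory, so run it in a scratch directory.

```shell
#!/bin/sh
# Recreate the sample data from post #1 (file names are placeholders).
printf '%s\n' '123-23456-2341' '783-ueu72-k812' > file1.txt
printf '%s\n' \
  'linenumber1-sometext:123-23456-2341:another text' \
  'linenumber2--sometext:637-7288-ju18:some other text' \
  'linenumber3--sometext:783-ueu72-k812:some more text' > file2.txt

# -F: treat every pattern as a fixed string, not a regular expression.
# -f: read the patterns, one per line, from file1.txt.
grep -F -f file1.txt file2.txt > matches.txt

cat matches.txt
```

This still prints only the matching lines of the second file; it does not prefix them with the key from the first file.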
# 3  
Old 02-24-2012
How do I link the string found in the first file to the matching line in the second file?

The second file consists of many other strings, including the strings in the first file.
# 4  
Old 02-24-2012
How big is your file?

If it is small, then the command below will work:

Code:
$ while read a; do grep -w "$a" file2.txt > /dev/null && echo -n "$a " && grep -w "$a" file2.txt; done < file1.txt
123-23456-2341 linenumber1-sometext:123-23456-2341:another text
783-ueu72-k812 linenumber3--sometext:783-ueu72-k812:some more text
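
The loop above runs grep twice for every line of the first file. A possible single-pass alternative is sketched below with awk. It assumes, as in the sample data, that the key is always the second colon-separated field of the second file; that assumption is not confirmed for the real data. The file names, including the output file file3.txt, are placeholders, and the script writes them to the current directory.

```shell
#!/bin/sh
# Recreate the sample data from post #1 (file names are placeholders).
printf '%s\n' '123-23456-2341' '783-ueu72-k812' > file1.txt
printf '%s\n' \
  'linenumber1-sometext:123-23456-2341:another text' \
  'linenumber2--sometext:637-7288-ju18:some other text' \
  'linenumber3--sometext:783-ueu72-k812:some more text' > file2.txt

# While reading file1.txt (NR == FNR): remember each whole line as a key.
# While reading file2.txt: if field 2 is a known key, print the key,
# a space, and the whole line.
# Assumption: the key is always field 2 of file2.txt, as in the sample.
awk -F: '
  NR == FNR    { keys[$0] = 1; next }
  ($2 in keys) { print $2, $0 }
' file1.txt file2.txt > file3.txt

cat file3.txt
```

The output follows the line order of the second file, and each file is read only once.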

# 5  
Old 02-24-2012
It outputs as expected, but when I try it with the actual file I get the error "grep: memory exhausted".

The file has about 2000 lines and its size is 150 KB.
# 6  
Old 02-24-2012
Hi.

Using your sample data files, here is an example of the use of join, which is unlikely to run into memory limits. There is a lot of set-up and comparison code, so just look at the heart of the computation, the join:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate join.

# Section 1, setup, pre-solution.
# Infrastructure details, environment, debug commands for forum posts. 
# Uncomment export command to run script as external user.
# export PATH="/usr/local/bin:/usr/bin:/bin"
set +o nounset
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
LC_ALL=C ; LANG=C ; export LC_ALL LANG
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
edges() { local _f _n _l;: ${1?"edges: need file"}; _f=$1;_l=$(wc -l $_f);
  head -${_n:=3} $_f ; pe "--- ( $_l: lines total )" ; tail -$_n $_f ; }
C=$HOME/bin/context && [ -f $C ] && $C join
set -o nounset

FILE1=${1-data1}
shift
FILE2=${1-data2}

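# Sample data files.
# Note: "specimen" appears to be a local display utility, not a standard
# command; comment it out (or use the "edges" calls below) if it is missing.
pe
specimen $FILE1 $FILE2 expected-output.txt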
# edges $FILE1 3
# pe
# edges $FILE2 3
# pe
# edges expected-output.txt 3

# Section 2, solution.
pl " Results:"
db " Section 2: solution."
join -t: -1 1 -2 2 -o 2.2,2.1,2.2,2.3 <( sort -t: $FILE1 ) <( sort -t: -k2,2 $FILE2 ) |
sed -e 's/:/ /' |	# first separator only
tee f1


# Section 3, post-solution, check results, clean-up, etc.
v1=$(wc -l <expected-output.txt)
v2=$(wc -l < f1)
pl " Comparison of $v2 created lines with $v1 lines of desired results:"
db " Section 3: validate generated calculations with desired results."

pl " Comparison with desired results:"
if [ ! -f expected-output.txt -o ! -s expected-output.txt ]
then
  pe " Comparison file \"expected-output.txt\" zero-length or missing."
  exit
fi
if cmp expected-output.txt f1
then
  pe " Succeeded -- files have same content."
else
  pe " Failed -- files not identical -- detailed comparison follows."
  if diff -b expected-output.txt f1
  then
    pe " Succeeded by ignoring whitespace differences."
  fi
fi

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
join (GNU coreutils) 6.10

Whole: 5:0:5 of 2 lines in file "data1"
123-23456-2341
783-ueu72-k812

Whole: 5:0:5 of 3 lines in file "data2"
linenumber1-sometext:123-23456-2341:another text
linenumber2--sometext:637-7288-ju18:some other text
linenumber3--sometext:783-ueu72-k812:some more text

Whole: 5:0:5 of 2 lines in file "expected-output.txt"
123-23456-2341 linenumber1-sometext:123-23456-2341:another text
783-ueu72-k812 linenumber3--sometext:783-ueu72-k812:some more text

-----
 Results:
123-23456-2341 linenumber1-sometext:123-23456-2341:another text
783-ueu72-k812 linenumber3--sometext:783-ueu72-k812:some more text

-----
 Comparison of 2 created lines with 2 lines of desired results:

-----
 Comparison with desired results:
 Succeeded -- files have same content.

The join utility requires files ordered on the join field. This means that the result might be out of order with respect to the original positions of lines in the file. If so, you will need to re-order, say on the line number field.

See man join and info coreutils join for details.
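
The join step can also be tried on its own, without the set-up and comparison code. The sketch below is one way to do that while restoring the original order of the first file, by tagging each key with its line number before sorting. It uses plain temporary files instead of process substitution, and the names data1.sorted, data2.sorted and joined.txt are placeholders. It assumes the second file has exactly three colon-separated fields with the key in field 2, as in the sample. The keys in data1 are deliberately listed in reverse order here to show the re-ordering.

```shell
#!/bin/sh
# Sample data; data1 is deliberately NOT in sorted order.
printf '%s\n' '783-ueu72-k812' '123-23456-2341' > data1
printf '%s\n' \
  'linenumber1-sometext:123-23456-2341:another text' \
  'linenumber2--sometext:637-7288-ju18:some other text' \
  'linenumber3--sometext:783-ueu72-k812:some more text' > data2

# Use one collation order for both sort and join.
LC_ALL=C; export LC_ALL

# Tag each key with its original line number ("key:lineno"), then sort on the key.
awk '{ print $0 ":" NR }' data1 | sort -t: -k1,1 > data1.sorted
# Sort the second file on its key field (field 2).
sort -t: -k2,2 data2 > data2.sorted

# Join on the key, emit "lineno:key:field1:field2:field3", restore the
# original order of data1, drop the line number, and turn the first
# remaining colon into a space.
# Assumption: data2 has exactly three colon-separated fields.
join -t: -1 1 -2 2 -o 1.2,2.2,2.1,2.2,2.3 data1.sorted data2.sorted |
sort -t: -k1,1n |
cut -d: -f2- |
sed -e 's/:/ /' > joined.txt

cat joined.txt
```

Lines in data2 with more than three fields would be cut off after the third field, as in the original join command.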

Best wishes ... cheers, drl
# 7  
Old 02-26-2012
It doesn't seem to give any result. On screen I get the message below; it shows all the other characters except the wanted result.
Code:
~/temp$ >try
./try: line 26: specimen: command not found
-----
 Results:
 ::
 ::
 ::
...
...
...
#::
#::
# CEGM45T2-SL9-0::
# CEGM45T2-SP9-0::
# CEGM45T2-SP9-F::


Last edited by Franklin52; 02-27-2012 at 03:27 AM.. Reason: Please use code tags for code and data samples, thank you