#1
Several file comparison not uniq or comm command
When comparing several files, is there a way to find the values unique to each file?

File1
Code:
a
b
c
d

File2
Code:
a
b
t

File3
Code:
a
c
h

I want to print d for File1 because it is not in the other two, t for File2, and h for File3.

Many thanks
L

Last edited by zaxxon; 02-09-2010 at 02:52 AM. Reason: use code tags please, ty
#2
Why not uniq or comm? They should be available on all Unixes and Linuxes.
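
For reference, a minimal sketch of how comm could be applied to three files: compare each file against the sorted union of the other two. The helper name unique_to, the temporary directory, and the sample files are assumptions made up for this illustration, not something posted in the thread.

```shell
#!/bin/sh
# Sample data matching the question; the directory and file names are illustrative.
workdir=$(mktemp -d)
printf '%s\n' a b c d > "$workdir/File1"
printf '%s\n' a b t   > "$workdir/File2"
printf '%s\n' a c h   > "$workdir/File3"

# unique_to FILE OTHER... prints the lines of FILE that occur in none of the OTHER files.
unique_to() {
    target=$1
    shift
    # comm needs sorted input; use the same locale for sort and comm.
    LC_ALL=C sort -u "$target" > "$workdir/this.sorted"
    LC_ALL=C sort -u "$@"      > "$workdir/others.sorted"
    # -23 suppresses columns 2 and 3, leaving lines found only in the first input.
    LC_ALL=C comm -23 "$workdir/this.sorted" "$workdir/others.sorted"
}

u1=$(unique_to "$workdir/File1" "$workdir/File2" "$workdir/File3")
u2=$(unique_to "$workdir/File2" "$workdir/File1" "$workdir/File3")
u3=$(unique_to "$workdir/File3" "$workdir/File1" "$workdir/File2")

echo "File1: $u1"
echo "File2: $u2"
echo "File3: $u3"

rm -rf "$workdir"
```

This needs one comm call per file, so it stays manageable only for a handful of files.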
#3
Assuming this is small scale with small text files, we can find the values that occur only once across the list of files and then look each of those values up in the original files. For example:
Code:
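cat filename* | sort | uniq -u | while IFS= read -r line
do
    echo "Unique value found : ${line} : in file"
    # -l prints the matching file name, -x matches whole lines,
    # -F treats the value as a fixed string rather than a regex
    grep -lxF -- "${line}" filename*
done

Last edited by methyl; 02-09-2010 at 02:27 PM. Reason: typos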
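
One caveat with the pipeline above: uniq -u keeps only lines that occur exactly once in the combined input, so a value repeated inside a single file (and absent from the others) would be dropped. A minimal sketch of a variant that de-duplicates each file on its own first; the temporary directory and the sample files, including the repeated d, are made up for this illustration:

```shell
#!/bin/sh
# Illustrative sample data; "d" appears twice in filename1 and nowhere else.
workdir=$(mktemp -d)
printf '%s\n' a b c d d > "$workdir/filename1"
printf '%s\n' a b t     > "$workdir/filename2"
printf '%s\n' a c h     > "$workdir/filename3"

found=$(
    # Sort and de-duplicate each file separately, so a value repeated
    # inside one file contributes a single line to the combined stream.
    for f in "$workdir"/filename*; do
        LC_ALL=C sort -u "$f"
    done | LC_ALL=C sort | uniq -u |
    while IFS= read -r line; do
        # Reverse lookup: grep -l names the one file that contains the value.
        file=$(grep -lxF -- "$line" "$workdir"/filename*)
        printf '%s %s\n' "$line" "${file##*/}"
    done
)
printf '%s\n' "$found"

rm -rf "$workdir"
```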
#4
|
||||
|
||||
|
Hi.

If I needed to get this done quickly, I would make use of the usual *nix commands: add the file name to each line, combine the results into a single file, sort it, collect the lines on which the data items are the same, and then filter for lines that have exactly 2 fields. For example:
Code:
#!/usr/bin/env bash
# @(#) s2 Demonstrate solve problem of unique values with collection.
# Infrastructure details, environment, commands for forum posts.
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo ; echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
c=$( ps | grep $$ | awk '{print $NF}' )
version >/dev/null 2>&1 && s=$(_eat $0 $1) || s=""
[ "$c" = "$s" ] && p="$s" || p="$c"
version >/dev/null 2>&1 && version "=o" $p awk
# set -o nounset
rm -f t1 t2
for file in data*
do
    sed "s/$/\t$file/" "$file" >> t1
done
echo
echo " Sample at beginning & end of $( wc -l < t1) lines in combined data file:"
head -3 t1
echo ...
tail -3 t1
echo
echo " Collector script:"
cat collect
echo
echo " Results for lines with 2 fields:"
sort t1 |
./collect |
tee t2 |
awk ' NF == 2 '
echo
echo " Intermediate file from awk collector script:"
cat t2
exit 0

producing for your data:
Code:
% ./s2
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution : Debian GNU/Linux 5.0
GNU bash 3.2.39
GNU Awk 3.1.5
Sample at beginning & end of 10 lines in combined data file:
a data1
b data1
c data1
...
a data3
c data3
h data3
Collector script:
#!/usr/bin/env sh
# @(#) collect Demonstrate collection script, awk.
FILE="$1"
# Use nawk or /usr/xpg4/bin/awk on Solaris.
awk '
BEGIN { FS = OFS = "\t" ; previous = "" ; line = "" ; first = "true"}
first == "true" { first = "false" ; previous = $1 ; line = $0 ; next }
$1 == previous { line = line "\t" $2 ; next }
{ print line ; previous = $1 ; line = $0 }
END { print line }
' $FILE
exit 0
Results for lines with 2 fields:
d data1
h data3
t data2
Intermediate file from awk collector script:
a data1 data2 data3
b data1 data2
c data1 data3
d data1
h data3
t data2

The awk script is for this specific instance. If this were going to be an ongoing task, I would write a more general multi-file join, with a self-join mode when only one file is specified. In fact, all the operations could probably be placed into the perl code, so that the data need be touched a minimum number of times.

Best wishes ... cheers, drl
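
The idea of touching the data only once can also be sketched in a single awk pass, here in awk rather than the perl mentioned above. This is an illustrative sketch, not drl's code: the array names seen, files, and owner and the temporary sample files are assumptions for the example.

```shell
#!/bin/sh
# Illustrative sample data named like the data* files used above.
workdir=$(mktemp -d)
printf '%s\n' a b c d > "$workdir/data1"
printf '%s\n' a b t   > "$workdir/data2"
printf '%s\n' a c h   > "$workdir/data3"

awk_result=$(
    cd "$workdir" || exit 1
    awk '
        # Count each (value, file) pair once, so repeats inside a file do not matter.
        !seen[$0, FILENAME]++ { files[$0]++ ; owner[$0] = FILENAME }
        END {
            # A value seen in exactly one file is unique to that file.
            for (value in files)
                if (files[value] == 1)
                    printf "%s\t%s\n", value, owner[value]
        }
    ' data1 data2 data3 | LC_ALL=C sort
)
printf '%s\n' "$awk_result"

rm -rf "$workdir"
```

The trailing sort is only there to make the order stable, since awk does not guarantee the order of a for-in loop. All distinct values are held in memory, which fits the small-file assumption made earlier in the thread.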