Full Discussion: Common lines from files
Top Forums Shell Programming and Scripting Common lines from files Post 302433203 by jaysean on Tuesday 29th of June 2010 02:20:29 AM

Thanks for the reply, but the script has some problems: it does not discard the lines that are not common. The output needs to be the intersection of the lines (i.e. the lines common to both files), and for each common line the value of the 3rd column should be compared so that the greatest value is shown.
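A rough sketch of one way to get that behaviour with awk follows. It is not the script from the earlier reply. It assumes tab-separated input, that two lines count as "common" when their first two columns match, and that the value to compare is in column 3 (the `col` variable); the function name `common_max` is made up for illustration.

```shell
# common_max FILE1 FILE2
# Print one line per key present in BOTH files, choosing the line whose
# value column is largest. Assumptions: tab-separated fields, key = columns
# 1 and 2, value = column 3 (change col= to compare another column).
common_max() {
    awk -F '\t' -v col=3 '
        # First file: remember the best line seen so far for each key.
        NR == FNR {
            key = $1 FS $2
            if (!(key in best) || $col + 0 > val[key]) {
                best[key] = $0
                val[key]  = $col + 0
            }
            next
        }
        # Second file: keys never seen in the first file are discarded.
        {
            key = $1 FS $2
            if (!(key in best)) next
            common[key] = 1
            if ($col + 0 > val[key]) {
                best[key] = $0
                val[key]  = $col + 0
            }
        }
        # Only keys marked as common are printed; order is unspecified.
        END { for (key in common) print best[key] }
    ' "$1" "$2"
}
```

Called as `common_max file1 file2 | sort`, it prints only the keys found in both files. The `sort` is there because awk does not guarantee the order of a `for (key in array)` loop.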
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

To find all common lines from 'n' no. of files

Hi, I have one situation. I have some 6-7 no. of files in one directory & I have to extract all the lines which exist in all these files. means I need to extract all common lines from all these files & put them in a separate file. Please help. I know it could be done with the help of... (11 Replies)
Discussion started by: The Observer
11 Replies

2. Shell Programming and Scripting

Drop common lines at head/tail of a large set of files

Hi! I have a large set of pairs of text files (each pair in their own subdirectory) and each pair shares head/tail (a couple of first and last lines) but differs in the middle part. I need to delete the heads/tails and keep only the middle portions in which they differ. The lengths of heads/tails... (1 Reply)
Discussion started by: dobryden
1 Reply

3. Shell Programming and Scripting

Common lines from files

Hello guys, I need a script to get the common lines from two files with a criteria that if the first two columns match then I keep the maximum value of the 5th column.(tab separated columns) . 3rd and 4th columns corresponds to the row which has highest value for the 5th column. Sample... (2 Replies)
Discussion started by: jaysean
2 Replies

4. Shell Programming and Scripting

Get common lines from multiple files

FileA chr1 31237964 NP_001018494.1 PUM1 M340L chr1 31237964 NP_055491.1 PUM1 M340L chr1 33251518 NP_037543.1 AK2 H191D chr1 33251518 NP_001616.1 AK2 H191D chr1 57027345 NP_001004303.2 C1orf168 P270S FileB chr1 ... (9 Replies)
Discussion started by: genehunter
9 Replies

5. Shell Programming and Scripting

Find common lines between multiple files

Hello everyone. A few years ago the user radoulov posted a fancy solution for a problem, which was about finding common lines (gene variation names) between multiple samples (files). The code was: awk 'END { for (R in rec) { n = split(rec, t, "/") if (n > 1) dup = dup ?... (5 Replies)
Discussion started by: bibb
5 Replies

6. UNIX for Dummies Questions & Answers

Filter lines common in two files

Thanks everyone. I got that problem solved. I require one more help here. (Yes, UNIX definitely seems to be fun and useful, and I WILL eventually learn it for myself. But I am now on a different project and don't really have time to go through all the basics. So, I will really appreciate some... (6 Replies)
Discussion started by: latsyrc
6 Replies

7. Shell Programming and Scripting

Finding out the common lines in two files using 4 fields with the help of awk and UNIX

Dear All, I have 2 files. If field 1, 2, 4 and 5 matches in both file1 and file2, I want to print the whole line of file1 and file2 one after another in my output file. File1: sc2/80 20 . A T 86 F=5;U=4 sc2/60 55 . G T ... (1 Reply)
Discussion started by: NamS
1 Reply

8. Shell Programming and Scripting

Find common lines with one file and with all of the files in another folder

Hi! I would like to comm -12 with one file and with all of the files in another folder that has a 100 files or more (that file is not in that folder) to find common text lines. I would like to have each case that they have common lines to be written to a different output file and the names of the... (6 Replies)
Discussion started by: Eve
6 Replies

9. Shell Programming and Scripting

Find common lines between all of the files in one folder

Could it be possible to find common lines between all of the files in one folder? Just like comm -12 . So all of the files two at a time. I would like all of the outcomes to be written to a different files, and the file names could be simply numbers - 1 , 2 , 3 etc. All of the file names contain... (19 Replies)
Discussion started by: Eve
19 Replies

10. UNIX for Beginners Questions & Answers

Awk: output lines with common field to separate files

Hi, A beginner one. my input.tab (tab-separated): h1 h2 h3 h4 h5 item1 grpA 2 3 customer1 item2 grpB 4 6 customer1 item3 grpA 5 9 customer1 item4 grpA 0 0 customer2 item5 grpA 9 1 customer2 objective: output a file for each customer ($5) with the item number ($1) only if $2 matches... (2 Replies)
Discussion started by: beca123456
2 Replies
DIFF(1) 							     GNU Tools								   DIFF(1)

NAME
       diff - find differences between two files

SYNOPSIS
       diff [options] from-file to-file

DESCRIPTION
       In the simplest case, diff compares the contents of the two files from-file and to-file. A file name of - stands for text read from the standard input. As a special case, diff - - compares a copy of standard input to itself.

       If from-file is a directory and to-file is not, diff compares the file in from-file whose file name is that of to-file, and vice versa. The non-directory file must not be -.

       If both from-file and to-file are directories, diff compares corresponding files in both directories, in alphabetical order; this comparison is not recursive unless the -r or --recursive option is given. diff never compares the actual contents of a directory as if it were a file. The file that is fully specified may not be standard input, because standard input is nameless and the notion of "file with the same name" does not apply.

       diff options begin with -, so normally from-file and to-file may not begin with -. However, -- as an argument by itself treats the remaining arguments as file names even if they begin with -.

   Options
       Below is a summary of all of the options that GNU diff accepts. Most options have two equivalent names, one of which is a single letter preceded by -, and the other of which is a long name preceded by --. Multiple single letter options (unless they take an argument) can be combined into a single command line word: -ac is equivalent to -a -c. Long named options can be abbreviated to any unique prefix of their name. Brackets ([ and ]) indicate that an option takes an optional argument.

       -lines
              Show lines (an integer) lines of context. This option does not specify an output format by itself; it has no effect unless it is combined with -c or -u. This option is obsolete. For proper operation, patch typically needs at least two lines of context.

       -a
              Treat all files as text and compare them line-by-line, even if they do not seem to be text.

       -b
              Ignore changes in amount of white space.

       -B
              Ignore changes that just insert or delete blank lines.

       --brief
              Report only whether the files differ, not the details of the differences.

       -c
              Use the context output format.

       -C lines
       --context[=lines]
              Use the context output format, showing lines (an integer) lines of context, or three if lines is not given. For proper operation, patch typically needs at least two lines of context.

       --changed-group-format=format
              Use format to output a line group containing differing lines from both files in if-then-else format.

       -d
              Change the algorithm to perhaps find a smaller set of changes. This makes diff slower (sometimes much slower).

       -D name
              Make merged if-then-else format output, conditional on the preprocessor macro name.

       -e
       --ed
              Make output that is a valid ed script.

       --exclude=pattern
              When comparing directories, ignore files and subdirectories whose basenames match pattern.

       --exclude-from=file
              When comparing directories, ignore files and subdirectories whose basenames match any pattern contained in file.

       --expand-tabs
              Expand tabs to spaces in the output, to preserve the alignment of tabs in the input files.

       -f
              Make output that looks vaguely like an ed script but has changes in the order they appear in the file.

       -F regexp
              In context and unified format, for each hunk of differences, show some of the last preceding line that matches regexp.

       --forward-ed
              Make output that looks vaguely like an ed script but has changes in the order they appear in the file.

       -h
              This option currently has no effect; it is present for Unix compatibility.

       -H
              Use heuristics to speed handling of large files that have numerous scattered small changes.

       --horizon-lines=lines
              Do not discard the last lines lines of the common prefix and the first lines lines of the common suffix.

       -i
              Ignore changes in case; consider upper- and lower-case letters equivalent.

       -I regexp
              Ignore changes that just insert or delete lines that match regexp.

       --ifdef=name
              Make merged if-then-else format output, conditional on the preprocessor macro name.

       --ignore-all-space
              Ignore white space when comparing lines.

       --ignore-blank-lines
              Ignore changes that just insert or delete blank lines.

       --ignore-case
              Ignore changes in case; consider upper- and lower-case to be the same.

       --ignore-matching-lines=regexp
              Ignore changes that just insert or delete lines that match regexp.

       --ignore-space-change
              Ignore changes in amount of white space.

       --initial-tab
              Output a tab rather than a space before the text of a line in normal or context format. This causes the alignment of tabs in the line to look normal.

       -l
              Pass the output through pr to paginate it.

       -L label
       --label=label
              Use label instead of the file name in the context format and unified format headers.

       --left-column
              Print only the left column of two common lines in side by side format.

       --line-format=format
              Use format to output all input lines in if-then-else format.

       --minimal
              Change the algorithm to perhaps find a smaller set of changes. This makes diff slower (sometimes much slower).

       -n
              Output RCS-format diffs; like -f except that each command specifies the number of lines affected.

       -N
       --new-file
              In directory comparison, if a file is found in only one directory, treat it as present but empty in the other directory.

       --new-group-format=format
              Use format to output a group of lines taken from just the second file in if-then-else format.

       --new-line-format=format
              Use format to output a line taken from just the second file in if-then-else format.

       --old-group-format=format
              Use format to output a group of lines taken from just the first file in if-then-else format.

       --old-line-format=format
              Use format to output a line taken from just the first file in if-then-else format.

       -p
              Show which C function each change is in.

       -P
              When comparing directories, if a file appears only in the second directory of the two, treat it as present but empty in the other.

       --paginate
              Pass the output through pr to paginate it.

       -q
              Report only whether the files differ, not the details of the differences.

       -r
              When comparing directories, recursively compare any subdirectories found.

       --rcs
              Output RCS-format diffs; like -f except that each command specifies the number of lines affected.

       --recursive
              When comparing directories, recursively compare any subdirectories found.

       --report-identical-files
       -s
              Report when two files are the same.

       -S file
              When comparing directories, start with the file file. This is used for resuming an aborted comparison.

       --from-file=file
              Compare file to all operands. file can be a directory.

       --to-file=file
              Compare all operands to file. file can be a directory.

       --sdiff-merge-assist
              Print extra information to help sdiff. sdiff uses this option when it runs diff. This option is not intended for users to use directly.

       --show-c-function
              Show which C function each change is in.

       --show-function-line=regexp
              In context and unified format, for each hunk of differences, show some of the last preceding line that matches regexp.

       --side-by-side
              Use the side by side output format.

       --speed-large-files
              Use heuristics to speed handling of large files that have numerous scattered small changes.

       --starting-file=file
              When comparing directories, start with the file file. This is used for resuming an aborted comparison.

       --suppress-common-lines
              Do not print common lines in side by side format.

       -t
              Expand tabs to spaces in the output, to preserve the alignment of tabs in the input files.

       -T
              Output a tab rather than a space before the text of a line in normal or context format. This causes the alignment of tabs in the line to look normal.

       --text
              Treat all files as text and compare them line-by-line, even if they do not appear to be text.

       -u
              Use the unified output format.

       --unchanged-group-format=format
              Use format to output a group of common lines taken from both files in if-then-else format.

       --unchanged-line-format=format
              Use format to output a line common to both files in if-then-else format.

       --unidirectional-new-file
              When comparing directories, if a file appears only in the second directory of the two, treat it as present but empty in the other.

       -U lines
       --unified[=lines]
              Use the unified output format, showing lines (an integer) lines of context, or three if lines is not given. For proper operation, patch typically needs at least two lines of context.

       -v
       --version
              Output the version number of diff.

       -w
              Ignore white space when comparing lines.

       -W columns
       --width=columns
              Use an output width of columns in side by side format.

       -x pattern
              When comparing directories, ignore files and subdirectories whose basenames match pattern.

       -X file
              When comparing directories, ignore files and subdirectories whose basenames match any pattern contained in file.

       -y
              Use the side by side output format.

SEE ALSO
       cmp(1), comm(1), diff3(1), ed(1), patch(1), pr(1), sdiff(1).

DIAGNOSTICS
       An exit status of 0 means no differences were found, 1 means some differences were found, and 2 means trouble.

GNU Tools                            22sep1993                            DIFF(1)
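
As an illustration of the exit statuses and the line-format options listed above, here is a small sketch. The function names `compare_files` and `common_lines` are made up for this example, and the second function relies on the GNU-specific `--old-line-format`, `--new-line-format` and `--unchanged-line-format` options, so it should not be expected to work with a non-GNU diff.

```shell
# compare_files FILE1 FILE2
# Translate diff's exit status into a word: 0 = no differences,
# 1 = differences found, anything else = trouble (e.g. a missing file).
compare_files() {
    cf_status=0
    diff -q "$1" "$2" >/dev/null 2>&1 || cf_status=$?
    case $cf_status in
        0) echo "identical" ;;
        1) echo "different" ;;
        *) echo "trouble" ;;
    esac
}

# common_lines FILE1 FILE2   (GNU diff only)
# Print only the lines diff considers unchanged between the two files:
# old-only and new-only lines get an empty format, unchanged lines are
# printed verbatim with %L (the line including its newline).
common_lines() {
    cl_status=0
    diff --old-line-format='' --new-line-format='' \
         --unchanged-line-format='%L' "$1" "$2" || cl_status=$?
    # Status 1 only means the files differ; fail only on real trouble.
    [ "$cl_status" -le 1 ]
}
```

Note that `common_lines` reports lines that match in the same relative order in both files, which is not the same as a set intersection; for sorted input, `comm -12 file1 file2` is the more usual tool.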