06-12-2008
To find all common lines from 'n' no. of files
Hi,
I have one situation. I have some 6-7 no. of files in one directory & I have to extract all the lines which exist in all these files. means I need to extract all common lines from all these files & put them in a separate file.
Please help. I know it could be done with the help of cut,sort & uniq commands. But it will take more time whenever the script is executed. I want some quick & shortcut method.
I am using ksh shell.
10 More Discussions You Might Find Interesting
1. UNIX for Dummies Questions & Answers
Hi, everyone,
Let's say, we have
xxx.txt
A 1 2 3 4 5
C 1 2 3 4 5
E 1 2 3 4 5
yyy.txt
A 1 2 3 4 5
B 1 2 3 4 5
C 1 2 3 4 5
D 1 2 3 4 5
E 1 2 3 4 5
First I match the first column I find intersection (A,C, E), then I want to take those lines with ACE out from yyy.txt, like
A 1... (11 Replies)
Discussion started by: kaixinsjtu
11 Replies
2. UNIX for Dummies Questions & Answers
Hi.
If we have this file
A B C
7 8 9
1 2 10
and this other file
A C D F
7 9 2 3
9 2 3 4
The result i´m looking for is intersection with A B C D F
so the answer here will be (10 Replies)
Discussion started by: alcalina
10 Replies
3. Shell Programming and Scripting
Hello guys,
I need a script to get the common lines from two files with a criteria that if the first two columns match then I keep the maximum value of the 3rd column.(tab separated columns)
Sample input:
file1:
111 222 0.1
333 444 0.5
555 666 0.4
file 2:
111 222 0.7
555 666... (5 Replies)
Discussion started by: jaysean
5 Replies
4. Shell Programming and Scripting
Hello guys,
I need a script to get the common lines from two files with a criteria that if the first two columns match then I keep the maximum value of the 5th column.(tab separated columns) . 3rd and 4th columns corresponds to the row which has highest value for the 5th column.
Sample... (2 Replies)
Discussion started by: jaysean
2 Replies
5. Shell Programming and Scripting
FileA
chr1 31237964 NP_001018494.1 PUM1 M340L
chr1 31237964 NP_055491.1 PUM1 M340L
chr1 33251518 NP_037543.1 AK2 H191D
chr1 33251518 NP_001616.1 AK2 H191D
chr1 57027345 NP_001004303.2 C1orf168 P270S
FileB
chr1 ... (9 Replies)
Discussion started by: genehunter
9 Replies
6. Shell Programming and Scripting
Hello everyone
A few years Ago the user radoulov posted a fancy solution for a problem, which was about finding common lines (gene variation names) between multiple samples (files). The code was:
awk 'END {
for (R in rec) {
n = split(rec, t, "/")
if (n > 1)
dup = dup ?... (5 Replies)
Discussion started by: bibb
5 Replies
7. UNIX for Dummies Questions & Answers
Thanks everyone. I got that problem solved.
I require one more help here. (Yes, UNIX definitely seems to be fun and useful, and I WILL eventually learn it for myself. But I am now on a different project and don't really have time to go through all the basics. So, I will really appreciate some... (6 Replies)
Discussion started by: latsyrc
6 Replies
8. Shell Programming and Scripting
I want to find common line in two files and replace the next line of first file with the next line of second file. (sed,awk,perl,bash any solution is welcomed ) Case Ignored. Multiple Occurrence of same line.
File 1:
hgacdavd
sndm,ACNMSDC
msgid "Rome"
msgstr ""
kgcksdcgfkdsb... (4 Replies)
Discussion started by: madira
4 Replies
9. Shell Programming and Scripting
Hi! I would like to comm -12 with one file and with all of the files in another folder that has a 100 files or more (that file is not in that folder) to find common text lines. I would like to have each case that they have common lines to be written to a different output file and the names of the... (6 Replies)
Discussion started by: Eve
6 Replies
10. Shell Programming and Scripting
Could it be possible to find common lines between all of the files in one folder? Just like comm -12 . So all of the files two at a time. I would like all of the outcomes to be written to a different files, and the file names could be simply numbers - 1 , 2 , 3 etc. All of the file names contain... (19 Replies)
Discussion started by: Eve
19 Replies
uniq(1) General Commands Manual uniq(1)
Name
uniq - report repeated lines in a file
Syntax
uniq [-udc[+n][-n]] [input[output]]
Description
The command reads the input file comparing adjacent lines. In the normal case, the second and succeeding copies of repeated lines are
removed; the remainder is written on the output file. Note that repeated lines must be adjacent in order to be found. For further infor-
mation, see
Options
The n arguments specify skipping an initial portion of each line in the comparison:
-n Skips specified number of fields. A field is defined as a string of non-space, non-tab characters separated by tabs and spaces from its
neighbors.
+n Skips specified number of characters in addition to fields. Fields are skipped before characters.
-c Displays number of repetitions, if any, for each line.
-d Displays only lines that were repeated.
-u Displays only unique (nonrepeated) lines.
If the -u flag is used, just the lines that are not repeated in the original file are output. The -d option specifies that one copy of
just the repeated lines is to be written. The normal mode output is the union of the -u and -d mode outputs.
The -c option supersedes -u and -d and generates an output report in default style but with each line preceded by a count of the number of
times it occurred.
See Also
comm(1), sort(1)
uniq(1)