UNIX for Dummies Questions & Answers: post 302799543 by Scottie1954 on Friday 26th of April 2013 04:41:37 PM
Find common numbers from two very large files using awk or the like

I've got two files that each contain a 16-digit number in positions 1-16. The first file has 63,120 entries all sorted numerically. The second file has 142,479 entries, also sorted numerically.

I want to read through each file and output the entries that appear in both. So far I've had no success with comm -12 or with grep -f. I've had some success with sdiff, but it isn't reliable: it misses some matches.

What I need is a script that loops through one file to see if an entry corresponds to the other file, but this is beyond my skills.

I am using sh on HP-UX 11.31, so I can't use nawk, gawk, etc.

Thank you for your assistance.

Last edited by Scottie1954; 04-26-2013 at 05:52 PM..
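
One possible approach, as a minimal sketch rather than a tested HP-UX solution: let awk load the 16-character keys of the smaller file into an array, then stream the larger file and print every key that was already seen. This assumes the system awk supports FNR and the `in` operator (both are in POSIX awk). The function name common_keys and the file names file1.txt and file2.txt are placeholders, and the sample numbers are made up.

```shell
#!/bin/sh
# Sketch: print the keys (columns 1-16) that occur in both files.
# common_keys, file1.txt and file2.txt are hypothetical names.
common_keys() {
    # $1 = smaller file (held in memory), $2 = larger file (streamed)
    awk '
        NR == FNR { seen[substr($0, 1, 16)] = 1; next }  # first file: remember each key
        {
            key = substr($0, 1, 16)                      # second file: compare on columns 1-16 only
            if (key in seen) print key
        }
    ' "$1" "$2"
}

# Tiny demonstration with made-up data
printf '%s\n' 1000000000000001 1000000000000002 1000000000000005 > file1.txt
printf '%s\n' 1000000000000002 1000000000000003 1000000000000005 > file2.txt
common_keys file1.txt file2.txt
```

Because only columns 1-16 are compared, trailing blanks or other text after the number do not affect the match. A key that appears more than once in the second file is printed once per occurrence, and neither file has to be sorted for this to work.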
 

SORT(1)                      General Commands Manual                      SORT(1)

NAME
     sort - sort a file of ASCII lines

SYNOPSIS
     sort [-bcdfimnru] [-tc] [-o name] [+pos1] [-pos2] file ...

OPTIONS
     -b   Skip leading blanks when making comparisons
     -c   Check to see if a file is sorted
     -d   Dictionary order: ignore punctuation
     -f   Fold upper case onto lower case
     -i   Ignore nonASCII characters
     -m   Merge presorted files
     -n   Numeric sort order
     -o   Next argument is output file
     -r   Reverse the sort order
     -t   Following character is field separator
     -u   Unique mode (delete duplicate lines)

EXAMPLES
     sort -nr file           # Sort keys numerically, reversed
     sort +2 -4 file         # Sort using fields 2 and 3 as key
     sort +2 -t: -o out      # Field separator is :
     sort +.3 -.6            # Characters 3 through 5 form the key

DESCRIPTION
     Sort sorts one or more files. If no files are specified, stdin is
     sorted. Output is written on standard output, unless -o is specified.
     The options +pos1 -pos2 use only fields pos1 up to but not including
     pos2 as the sort key, where a field is a string of characters
     delimited by spaces and tabs, unless a different field delimiter is
     specified with -t. Both pos1 and pos2 have the form m.n where m tells
     the number of fields and n tells the number of characters. Either m
     or n may be omitted.

SEE ALSO
     comm(1), grep(1), uniq(1).
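
To tie the -u and -c options and the comm(1) cross-reference back to the question at the top of the thread, here is a hedged sketch of a sort-plus-comm pipeline. It assumes that a comm -12 attempt can come back empty when the lines differ after column 16 (trailing blanks, extra fields) or when the files are not in the collating order comm expects; the original post does not confirm either cause. The file names a.txt, b.txt, a.keys and b.keys and the sample data are made up.

```shell
#!/bin/sh
# Sketch: reduce both inputs to their 16-character keys, sort them in the
# order comm expects, then keep only the lines present in both.
# a.txt, b.txt, a.keys and b.keys are hypothetical names with made-up data.
printf '%s\n' '1000000000000005 x' '1000000000000001' '1000000000000002' > a.txt
printf '%s\n' '1000000000000002 ' '1000000000000005' '1000000000000003' > b.txt

# Use plain byte ordering so that sort and comm agree on the collating sequence
LC_ALL=C
export LC_ALL

cut -c1-16 a.txt | sort -u > a.keys    # -u drops duplicate keys
cut -c1-16 b.txt | sort -u > b.keys

sort -c a.keys && echo "a.keys is sorted"   # -c only checks the order, it does not re-sort

comm -12 a.keys b.keys                 # suppress columns 1 and 2: lines common to both remain
```

Plain sort is used instead of sort -n because comm compares lines as strings. For fixed-width 16-digit numbers the string order and the numeric order coincide, so either way of sorting should give comm the input order it needs.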