Sponsored Content
Full Discussion: Faster search needed
Top Forums Shell Programming and Scripting Faster search needed Post 302671087 by daytripper1021 on Friday 13th of July 2012 04:52:54 AM
Old 07-13-2012
Question Faster search needed

Hope you guys out there can help.

I have 2 files as below:

file 1:

Code:
111,222,333,444,555,666
777,888,999,000,111,222
111,222,333,444,555,888

file 2:
Code:
666,AAA
222,BBB
888,CCC

I want to get the 6th column from file 1 (example, 666) and check in file 2 for the value in the 2nd column (AAA). Then print the file2 value (AAA) at the end of file1. Results should be as below:

result:
Code:
111,222,333,444,555,666,AAA
777,888,999,000,111,222,BBB
111,222,333,444,555,888,CCC

I already have a code for this but I found it to be slow (file1 has about a million lines while file2 has about 20,000 lines). I think there should be a faster way of doing this. See below for the code I did:

Code:
for line in `cat file2`
do
  cellid=`echo $line|awk -F"," {'print $6'}`
  area=`nawk -F"," -v cellid=$cellid '{if($1==cellid) print $2}' file2`
  echo "$line,$area"  >> result.txt
done

Hope you can help.

Thanks in advance!

Last edited by Scrutinizer; 07-13-2012 at 10:25 AM.. Reason: code tags for data files
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Help needed in search string

Hi , I learning shell scripting.. I need to do the following in my shell script. Search a given logfile for two\more strings. If the the two strings are found. write it to a outputfile if only one of the string is found, write the found string in one output file and other in other... (2 Replies)
Discussion started by: amitrajvarma
2 Replies

2. UNIX for Advanced & Expert Users

search a replace each line- help needed ASAP

can someone help me with the find and replace command. I have a input file which is in the below format: 0011200ALN00000000009EGYPT 000000000000199900000 0011200ALN00000000009EGYPT 000000000000199900000 0011200ALN00000000008EGYPT 000000000000199800000 0011200ALN00000000009EGYPT ... (20 Replies)
Discussion started by: bsandeep_80
20 Replies

3. Shell Programming and Scripting

Complex Search/Replace Multiple Files Script Needed

I have a rather complicated search and replace I need to do among several dozen files and over a hundred occurrences. My site is written in PHP and throughout the old code, you will find things like die("Operation Aborted due to....."); For my new design skins for the site, I need to get... (2 Replies)
Discussion started by: UCCCC
2 Replies

4. Shell Programming and Scripting

Printing 10 lines above and below the search string: help needed

Hi, The below code will search a particular string(say false in this case) and return me 10 lines above and below the search string in a file. " awk 'c-->0;$0~s{if(b)for(c=b+1;c>1;c--)print r;print("***********************************");print;c=a;}b{r=$ 0}' b=10 a=10 s="false" " ... (5 Replies)
Discussion started by: vimalm22
5 Replies

5. Shell Programming and Scripting

Help needed with basic search

hi, im trying to find the longest word in /usr/share/dict/words that does not contain the letter i. i've tried using the wc -L command like so: $ wc -L /usr/share/dict/words which basically tells me the longest word which is good but how do i get the longest word which Does not contain the... (7 Replies)
Discussion started by: tryintolearn
7 Replies

6. Shell Programming and Scripting

search needed part in text file (awk?)

Hello! I have text file: From aaa@bbb Fri Jun 1 10:04:29 2010 --____OSPHWOJQGRPHNTTXKYGR____ Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline My code '234565'. ... (2 Replies)
Discussion started by: candyme
2 Replies

7. Shell Programming and Scripting

Search for a pattern and replace. Help needed

I have three variables $a, $b and $c $a = file_abc_123.txt $b = 123 $c = 100 I want to search if $b is present in $a. If it is present, then i want to replace that portion by $c. Here $b = 123 is present in "file_abc_123.txt", so i need the output as "file_abc_100.txt' How can this be... (3 Replies)
Discussion started by: irudayaraj
3 Replies

8. UNIX for Dummies Questions & Answers

Help needed - find command for recursive search

Hi All I have a requirement to find the file that are most latest to be modified in each directory. Can somebody help with the command please? E.g of the problem. The directory A is having sub directory which are having subdirectory an so on. I need a command which will find the... (2 Replies)
Discussion started by: sudeep.id
2 Replies

9. Shell Programming and Scripting

Recursive folder search faster than find?

I'm trying to find folders created by a propritary data aquisition software with the .aps ending--yes, I have never encountered folder with a suffix before (some files also end in .aps) and sort them by date. I need the whole path ls -dt "$dataDir"*".aps"does exactly what I want except for the... (2 Replies)
Discussion started by: Michael Stora
2 Replies

10. Shell Programming and Scripting

A faster way to read and search

I have a simple script that reads in data from fileA.txt and searches line by line for that data in multiple files (*multfiles.txt). It only prints the data when there is more than 1 instance of it. The problem is that its really slow (3+ hours) to complete the entire process. There are nearly 1500... (10 Replies)
Discussion started by: ncwxpanther
10 Replies
comm(1) 							   User Commands							   comm(1)

NAME
comm - select or reject lines common to two files SYNOPSIS
comm [-123] file1 file2 DESCRIPTION
The comm utility reads file1 and file2, which must be ordered in the current collating sequence, and produces three text columns as output: lines only in file1; lines only in file2; and lines in both files. If the input files were ordered according to the collating sequence of the current locale, the lines written will be in the collating sequence of the original lines. If not, the results are unspecified. OPTIONS
The following options are supported: -1 Suppresses the output column of lines unique to file1. -2 Suppresses the output column of lines unique to file2. -3 Suppresses the output column of lines duplicated in file1 and file2. OPERANDS
The following operands are supported: file1 A path name of the first file to be compared. If file1 is -, the standard input is used. file2 A path name of the second file to be compared. If file2 is -, the standard input is used. USAGE
See largefile(5) for the description of the behavior of comm when encountering files greater than or equal to 2 Gbyte ( 2^31 bytes). EXAMPLES
Example 1 Printing a list of utilities specified by files If file1, file2, and file3 each contain a sorted list of utilities, the command example% comm -23 file1 file2 | comm -23 - file3 prints a list of utilities in file1 not specified by either of the other files. The entry: example% comm -12 file1 file2 | comm -12 - file3 prints a list of utilities specified by all three files. And the entry: example% comm -12 file2 file3 | comm -23 -file1 prints a list of utilities specified by both file2 and file3, but not specified in file1. ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of comm: LANG, LC_ALL, LC_COLLATE, LC_CTYPE, LC_MESSAGES, and NLSPATH. EXIT STATUS
The following exit values are returned: 0 All input files were successfully output as specified. >0 An error occurred. ATTRIBUTES
See attributes(5) for descriptions of the following attributes: +-----------------------------+-----------------------------+ | ATTRIBUTE TYPE | ATTRIBUTE VALUE | +-----------------------------+-----------------------------+ |Availability |SUNWesu | +-----------------------------+-----------------------------+ |CSI |enabled | +-----------------------------+-----------------------------+ |Interface Stability |Standard | +-----------------------------+-----------------------------+ SEE ALSO
cmp(1), diff(1), sort(1), uniq(1), attributes(5), environ(5), largefile(5), standards(5) SunOS 5.11 3 Mar 2004 comm(1)
All times are GMT -4. The time now is 02:08 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy