How to make a quick search through a script?


 
# 1  
Old 04-28-2013

Hello,

I have a file with more than 300K records (each record is a number). I need to take these records and then search for them in 40 files, each of which has more than 1.8 million records.

I wrote a script, but it is very slow and takes a lot of time. I tried splitting my 300K records into 6 files of 50K records each and running the scripts in parallel to speed up the search, but searching across 40 files is still slow.

Here is what my script looks like:

t1 --> File containing more than 300K records, i.e. integer numbers
Code:
head t1
3028797272
3028797391
3028797459
3028797826
3028797879

t2* --> more than 40 files, each with more than 1.8M records.
Code:
head t2_DAILY_SDP59.DUMP_subscriber.v3.csv
3048924971,3048924971,0,0,,0,1,2,0,0,0,0,0,0,1,1,1,13,13,,0.000000,1,2014-09-14,2012-09-29,,2014-09-14,0,0,2012-10-09,1,1,,0,0,0,0,0,0,0,0,0,0,0,2012-09-14
3069660757,3069660757,0,0,,1,1,2,0,0,0,0,0,0,1,1,1,13,13,,0.900000,1,2015-04-19,2015-04-19,,2015-04-19,0,0,2015-04-19,0,0,,0,0,0,0,0,0,0,0,0,0,0,2012-10-24
3038103705,3038103705,0,0,,1,1,2,1,0,0,0,0,0,1,1,1,26,26,,9.870000,1,2015-04-25,2015-04-25,,2015-04-25,0,0,2015-04-25,0,0,,0,0,0,0,0,0,0,0,0,0,0,2008-02-25
3038902927,3038902927,0,0,,1,1,2,1,0,0,0,0,0,1,1,1,13,13,,6.460000,1,2015-01-14,2015-01-14,,2015-01-14,0,0,2015-01-14,0,0,,0,0,0,0,0,0,0,0,0,0,0,2011-09-04

Code:
#!/bin/bash
for a1 in `cat /dump/20130426_DAILY_SDP/parallelProcessing/t1*`
do
for b1 in `ls /dump/20130426_DAILY_SDP/parallelProcessing/t2*`
do
cat $b1|grep "$a1"|nawk -F "," '{print $1 " " $18}' >> /dump/20130426_DAILY_SDP/parallelProcessing/out_t1.log
grep $a1 /dump/20130426_DAILY_SDP/parallelProcessing/out_t1.log
if [ "$?" -eq "0" ]; then
break 1
fi
done
done

I have tried the PPSS script, but it is not working properly.

Is there a way to use grep more efficiently so that the search runs faster? The system I am running my script on is Solaris.

Any help would be much appreciated.

Thanks!!

Regards,
Umar

# 2  
Old 04-28-2013
That's a pretty inefficient script!
People would be able to help you better if you posted a sample of the file t1 and of one of the t2* files.
# 3  
Old 04-28-2013
here you go

t1 --> File containing more than 300K records, i.e. integer numbers
Code:
head t1
3028797272
3028797391
3028797459
3028797826
3028797879

t2* --> more than 40 files, each with more than 1.8M records.
Code:
head t2_DAILY_SDP59.DUMP_subscriber.v3.csv
3048924971,3048924971,0,0,,0,1,2,0,0,0,0,0,0,1,1,1,13,13,,0.000000,1,2014-09-14,2012-09-29,,2014-09-14,0,0,2012-10-09,1,1,,0,0,0,0,0,0,0,0,0,0,0,2012-09-14
3069660757,3069660757,0,0,,1,1,2,0,0,0,0,0,0,1,1,1,13,13,,0.900000,1,2015-04-19,2015-04-19,,2015-04-19,0,0,2015-04-19,0,0,,0,0,0,0,0,0,0,0,0,0,0,2012-10-24
3038103705,3038103705,0,0,,1,1,2,1,0,0,0,0,0,1,1,1,26,26,,9.870000,1,2015-04-25,2015-04-25,,2015-04-25,0,0,2015-04-25,0,0,,0,0,0,0,0,0,0,0,0,0,0,2008-02-25
3038902927,3038902927,0,0,,1,1,2,1,0,0,0,0,0,1,1,1,13,13,,6.460000,1,2015-01-14,2015-01-14,,2015-01-14,0,0,2015-01-14,0,0,,0,0,0,0,0,0,0,0,0,0,0,2011-09-04


# 4  
Old 04-28-2013
You should really go back and edit the post to include code tags.
# 5  
Old 04-28-2013
Assuming that I got your requirement right, try:
Code:
cd /dump/20130426_DAILY_SDP/parallelProcessing/
awk 'FNR==NR{a[$1];next}$1 in a{print $1,$18}' t1 FS=, t2* > out_t1.log
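
In case the one-liner is hard to read, here is the same logic spelled out with comments. This is only a sketch: the temporary directory, the sample rows, and the array name `wanted` are made up for illustration, and the rows are cut down to 19 fields so that field 18 is easy to spot.

```shell
#!/bin/sh
# Tiny made-up sample files (hypothetical data, only to show the mechanics).
tmp=$(mktemp -d)
printf '%s\n' 3028797272 3038103705 > "$tmp/t1"
printf '%s\n' \
  '3048924971,3048924971,0,0,,0,1,2,0,0,0,0,0,0,1,1,1,13,13' \
  '3038103705,3038103705,0,0,,1,1,2,1,0,0,0,0,0,1,1,1,26,26' > "$tmp/t2_a.csv"
printf '%s\n' \
  '3028797272,3028797272,0,0,,1,1,2,1,0,0,0,0,0,1,1,1,13,13' > "$tmp/t2_b.csv"

result=$(
  awk '
    # First file (t1): FNR equals NR only while the first file is being read.
    # Store each number as an array key, then skip to the next line.
    FNR == NR { wanted[$1]; next }
    # Remaining files (t2*): FS is "," by now, so $1 is the first CSV field.
    # Print fields 1 and 18 when the first field is a key loaded from t1.
    $1 in wanted { print $1, $18 }
  ' "$tmp/t1" FS=, "$tmp"/t2_*.csv
)
printf '%s\n' "$result"
rm -rf "$tmp"
```

With these sample files it prints `3038103705 26` and then `3028797272 13`. The point of the design is that t1 is loaded into memory once and every t2 file is read exactly once, instead of re-reading each t2 file for every number in t1.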


# 6  
Old 04-28-2013
My aim is to take the numbers from file t1 (which contains more than 300K records) and search for them in 40 t2 files, each of which contains 1.8M records.

If a number in t1 is found in the t2 files, then I take columns 1 and 18 of the matching row and save them in a separate file. How can I do this efficiently and quickly?



---------- Post updated at 07:37 AM ---------- Previous update was at 07:35 AM ----------

Quote:
Originally Posted by elixir_sinari
Assuming that I got your requirement right, try:
Code:
cd /dump/20130426_DAILY_SDP/parallelProcessing/
awk 'FNR==NR{a[$1];next}$1 in a{print $1,$18}' t1 FS=, t2* > out_t1.log

elixir_sinari, thanks for your response. Your script gives the following error upon execution:
Code:
awk 'FNR==NR{a[$1];next}$1 in a{print $1,$18}' t1 FS=, t2* > testunix.log
awk: syntax error near line 1
awk: bailing out near line 1
pwd
/dump/20130426_DAILY_SDP/parallelProcessing
# 7  
Old 04-28-2013
Quote:
Originally Posted by umarsatti
My aim is to take the numbers from file t1 (which contains more than 300K records) and search for them in 40 t2 files, each of which contains 1.8M records.

If a number in t1 is found in the t2 files, then I take columns 1 and 18 of the matching row and save them in a separate file. How can I do this efficiently and quickly?

---------- Post updated at 07:35 AM ---------- Previous update was at 07:34 AM ----------

elixir_sinari, thanks for your response. Your script gives the following error upon execution:
Code:
awk 'FNR==NR{a[$1];next}$1 in a{print $1,$18}' t1 FS=, t2* > testunix.log
awk: syntax error near line 1
awk: bailing out near line 1
pwd
/dump/20130426_DAILY_SDP/parallelProcessing
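Use nawk on Solaris. The default /usr/bin/awk there is the old, pre-POSIX awk, which is what produces the syntax error above; nawk (or /usr/xpg4/bin/awk) accepts the one-liner as written.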
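
If the same script has to run on both Solaris and other systems, one option is to pick the awk binary at run time. A rough sketch, assuming bash (as in the original script) and that nawk is on the PATH on the Solaris box; the variable name `AWK`, the temporary directory, and the sample row are made up for illustration:

```shell
#!/bin/bash
# Assumption: nawk is on the PATH on Solaris; elsewhere plain awk is POSIX-compatible.
if command -v nawk >/dev/null 2>&1; then
  AWK=nawk
else
  AWK=awk
fi

# Throwaway sample files (made-up data) so the sketch runs anywhere.
dir=$(mktemp -d)
printf '3038902927\n' > "$dir/t1"
printf '3038902927,3038902927,0,0,,1,1,2,1,0,0,0,0,0,1,1,1,13,13\n' > "$dir/t2_sample.csv"

# Same one-liner as in the reply above, run through the selected awk.
"$AWK" 'FNR==NR{a[$1];next}$1 in a{print $1,$18}' "$dir/t1" FS=, "$dir"/t2* > "$dir/out_t1.log"
out=$(cat "$dir/out_t1.log")
echo "$out"
rm -rf "$dir"
```

For the real data, drop the sample-file lines and run the last awk command from /dump/20130426_DAILY_SDP/parallelProcessing/ against t1 and t2* as before.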