Shell Programming and Scripting: post 302913041 by Cludgie on Wednesday 13th of August 2014, 12:50:06 PM
Incredibly inefficient cat | grep script

Hi there,

I have 2 files that I am trying to work on.

File 1 contains a reference list of unique subscriber numbers (7 million entries in total).

File 2 contains a list of subscriber numbers and their tariffs (15 million entries in total). This file is in the production system and hasn't had old subscribers removed for some time, so more than half of the entries need to be removed.

I wrote the following couple of lines to try to extract the 7 million active subscriber numbers and their tariffs from the behemoth 15 million entry list:

Code:
cat accurate_list.csv | while read ref
do
    grep "$ref" production_list.csv >> new_msisdn_list.csv
done

While this does work, it only produces around 20 entries per second, so at that rate the 7 million lookups will take about four days to complete.

I'm afraid I'm a complete noob and can't come up with anything more inventive. I'm sure awk, sed, perl, or any number of other languages would be perfect for something like this.

Anyway, any suggestions would be gratefully received.

Thanks
Cludgie
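
The loop above is slow because it rescans the whole 15 million line file once for every subscriber number. A common alternative is a single-pass hash join in awk, sketched below. This is only an illustration under assumptions the post does not confirm: that both files are comma-separated and that the subscriber number is the first field. The sample files and numbers are made up so the sketch runs on its own.

```shell
#!/bin/sh
# Tiny sample files with made-up numbers so the sketch runs standalone.
# Assumption: the subscriber number is the first comma-separated field.
printf '%s\n' 447700900001 447700900003 > accurate_list.csv
printf '%s\n' \
  '447700900001,tariffA' \
  '447700900002,tariffB' \
  '447700900003,tariffC' > production_list.csv

# Pass 1 (NR == FNR): store every active subscriber number as an array key.
# Pass 2: print only production lines whose first field is a stored key.
awk -F, 'NR == FNR { active[$1]; next } $1 in active' \
    accurate_list.csv production_list.csv > new_msisdn_list.csv

cat new_msisdn_list.csv
```

Each file is read only once, at the cost of holding the reference list's keys in memory. `grep -F -f accurate_list.csv production_list.csv` is a similar one-pass option, though it matches a number anywhere on the line rather than only in the first field.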
 
