Post 302414157 by naveen@ on Monday 19th of April 2010 10:57:35 AM
File processing

Hi guys, I have two files. File A has around 4 million lines, in this format:

fileA

Code:
test.term.n4814 test.term.n3199
test.term.n4814 test.term.n4803
test.term.n4814 test.term.n_1767
test.term.n4810 test.term.n_3708
test.term.n4811 test.term.n_3745
test.term.n4817 test.term.n_3869
test.term.n4812 test.term.n_64430
test.term.n4814 test.term.n_75678
test.term.n4814 test.term.n_75686
test.term.n4819 test.term.n_75702
test.term.n4812 test.term.n_77979
test.term.n4818 test.term.n_78077
test.term.n4813 test.term.n_78522
test.term.n4815 test.term.n_87649
test.term.n4817 test.term.n_87818

File B has fewer lines (a few thousand), for example:

fileB
Code:
test.term.n_75702
test.term.n4819
test.term.n4814
term.n_78077

I am trying to write a script that checks whether each line of File B occurs anywhere in a line of File A and, if it does, deletes that line from File A. The script I have written is:

filter.sh
Code:
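#!/bin/sh
# Start from a working copy of fileA; each pass below filters it further.
cp fileA temp2
while read -r line
do
    cp temp2 temp1
    # Delete every line of temp1 that matches the current fileB entry.
    # Note: sed treats the entry as a regular expression.
    sed "/${line}/d" temp1 > temp2
done < fileB
cp temp2 filtered_fileA
rm -f temp1 temp2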


This script works for small files like the ones above.

However, on the actual file, which has over 4 million lines, it has been running for more than 4-5 days.

Is there a faster and easier way to do this on large files?

The expected output is:

filtered_fileA
Code:
test.term.n4810 test.term.n_3708
test.term.n4811 test.term.n_3745
test.term.n4817 test.term.n_3869
test.term.n4812 test.term.n_64430
test.term.n4812 test.term.n_77979
test.term.n4813 test.term.n_78522
test.term.n4815 test.term.n_87649
test.term.n4817 test.term.n_87818


Thanks in advance,
Naveen

Last edited by naveen@; 04-19-2010 at 12:06 PM.. Reason: formatting issue
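
One possible approach, offered as a sketch and not timed on the real 4-million-line file: the script above rewrites the whole of File A once for every line of File B, so a few thousand entries mean a few thousand full passes. grep can load all of File B as a pattern list and filter File A in a single pass. The example assumes the entries in fileB are literal strings that may appear anywhere in a line, which is what the expected output suggests (term.n_78077 removes the test.term.n4818 line). The scratch directory created with mktemp -d is only there to make the example self-contained; mktemp is widely available but not part of POSIX.

```shell
#!/bin/sh
# Sketch only: recreate the sample data from the post in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > fileA <<'EOF'
test.term.n4814 test.term.n3199
test.term.n4814 test.term.n4803
test.term.n4814 test.term.n_1767
test.term.n4810 test.term.n_3708
test.term.n4811 test.term.n_3745
test.term.n4817 test.term.n_3869
test.term.n4812 test.term.n_64430
test.term.n4814 test.term.n_75678
test.term.n4814 test.term.n_75686
test.term.n4819 test.term.n_75702
test.term.n4812 test.term.n_77979
test.term.n4818 test.term.n_78077
test.term.n4813 test.term.n_78522
test.term.n4815 test.term.n_87649
test.term.n4817 test.term.n_87818
EOF

cat > fileB <<'EOF'
test.term.n_75702
test.term.n4819
test.term.n4814
term.n_78077
EOF

# -f fileB : read the patterns from fileB, one per line
# -F       : treat each pattern as a fixed string, not a regular expression
# -v       : print only the lines of fileA that match none of the patterns
grep -v -F -f fileB fileA > filtered_fileA

cat filtered_fileA
```

Two things to watch for on real data: with -F the dots are matched literally, whereas the sed version treats them as "any character", so the results can differ slightly; and an empty line in fileB acts as an empty pattern that matches every line, which would leave filtered_fileA empty.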
 
