The UNIX and Linux Forums  
#1 | 03-13-2008 | JeffV (Registered User)
Splitting text file to several other files using sed.

I'm trying to figure out how to do this with as little execution time as possible, and I'm pretty sure sed is the best way. However, I'm new to sed, and none of the reading and examples I've found show a similar exercise:

I have a long text file (I'll call it all_files.txt) listing all the files on the system, each line showing the checksum, permissions, date, and file name with path. For example:

683706D9 104775 Sep 27 12:00:04 1999 /bin/Audio
4C799E06 100775 Nov 14 17:33:11 1997 /bin/Blkfsys
C851669A 104775 Oct 04 14:08:38 1996 /bin/Dev16
CA4B42E7 100775 Nov 21 11:58:06 1996 /bin/Dev16.ansi
FF4396D0 100775 Oct 04 14:06:03 1996 /bin/Dev16.par

Some of these files are assigned to categories by separate text files, each listing the files that belong to one category. For example, the file catA.dat may be the following:

/bin/Dev16.par
/bin/some_other_file
/home/stuff/another_file

Similar lists would exist for catB.dat and catC.dat.

What should happen is that every line in the original file that belongs to a category is deleted from the original file and copied to a new file for that category, say catA_list, catB_list, etc. In the end, only the files not assigned to any category are left in all_files.txt.

Is there an easy way to do this? I've figured out how to use sed to delete lines, but to output them to 3 different files based on matches from reference text files is confusing me. Any ideas would be greatly appreciated!!
#2 | 03-13-2008 | fpmurphy (Forum Staff, Moderator)
The following example demonstrates how to write results out to three different files:

Code:
#!/usr/bin/ksh

# Scratch copy of the sample listing, named after this shell's PID
tmp=file.$$

cat <<EOT >"$tmp"
683706D9 104775 Sep 27 12:00:04 1999 /bin/Audio
4C799E06 100775 Nov 14 17:33:11 1997 /bin/Blkfsys
C851669A 104775 Oct 04 14:08:38 1996 /bin/Dev16
CA4B42E7 100775 Nov 21 11:58:06 1996 /bin/Dev16.ansi
FF4396D0 100775 Oct 04 14:06:03 1996 /bin/Dev16.par
EOT

# -n suppresses the default output; each w command writes the lines
# matching its address to its own file
sed -n -e '/^68/w ./out1' -e '/^C8/w ./out2' -e '/^CA/w ./out3' "$tmp"

rm "$tmp"
exit 0
#3 | 03-13-2008 | JeffV (Registered User)
Thanks! That helps, although I'm seeing other complications here. For one, I won't know in advance what I'm searching for, since the patterns will come in as input from other files (catA.dat, catB.dat, catC.dat). I was initially thinking I could use a loop to read each file name from the category lists, working on one category at a time:

while read AFILE
do
    sed -n -e '\|"$AFILE"$| {
        w /catA_list
        d
    }' < all_files.txt > tmpfile
done < catA.dat


However, two more loops would be needed for the Category B and Category C files, so the listing would be read many times over. I'm not even sure this would work correctly. There has to be a more efficient way to do this.
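
One way to avoid the repeated passes is sketched below. It uses awk rather than sed, which is a swapped-in tool and not something proposed in this thread. All category lists are loaded into a lookup table first, then all_files.txt is read exactly once. Assumptions: paths contain no whitespace (so the path is the last field), the contents of catB.dat and catC.dat are invented for the demo, and the scratch directory and the name uncategorized.txt are hypothetical.

```shell
#!/bin/sh
# Sketch only: sample data from the first post, in a throwaway directory.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1

cat > all_files.txt <<'EOT'
683706D9 104775 Sep 27 12:00:04 1999 /bin/Audio
4C799E06 100775 Nov 14 17:33:11 1997 /bin/Blkfsys
C851669A 104775 Oct 04 14:08:38 1996 /bin/Dev16
CA4B42E7 100775 Nov 21 11:58:06 1996 /bin/Dev16.ansi
FF4396D0 100775 Oct 04 14:06:03 1996 /bin/Dev16.par
EOT

# Category lists (catB.dat and catC.dat contents are made up for this demo).
printf '%s\n' /bin/Dev16.par /bin/some_other_file > catA.dat
printf '%s\n' /bin/Audio > catB.dat
printf '%s\n' /bin/Dev16 > catC.dat

awk '
    # Category lists come first: map each path to its output file,
    # e.g. catA.dat -> catA_list.
    FILENAME != "all_files.txt" {
        out = FILENAME
        sub(/\.dat$/, "_list", out)
        category[$0] = out
        next
    }
    # Main listing: exact match on the last field (the path).
    ($NF in category) { print > (category[$NF]); next }
    # Anything not in a category goes to standard output.
    { print }
' catA.dat catB.dat catC.dat all_files.txt > uncategorized.txt
```

Because the lookup is an exact match on the whole path, /bin/Dev16 does not also pick up /bin/Dev16.ansi. Once the result looks right, `mv uncategorized.txt all_files.txt` would leave only the uncategorized lines in the original file.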
#4 | 03-14-2008 | JeffV (Registered User)
Ok, I've figured out a solution that mostly works; however, I can't get it to work when passing a variable into the sed regular expression. This is what I have for extracting every line for a file in category A:

while read AFILE
do
    sed -e '\|'"$AFILE"'$|{
        w /tmp/catA_list
        d
    }' < /tmp/all_files > /tmp/non_AFiles
done < catA.dat


The regular expression works fine if I substitute a specific file name. But when the name is passed in this way, only one of the files in the catA list is found and not the others. Is there a better way to pass the argument here?

Also, the file name includes the full path, which is why '/' is not being used as the delimiter for the expression.
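
A likely reason only one file shows up: the quoting is fine, but each pass through the loop starts a fresh sed that re-reads the unmodified /tmp/all_files and truncates both /tmp/catA_list (the w target) and /tmp/non_AFiles (the > redirection). Whatever the last line of catA.dat matched is all that remains. A minimal sketch of one way around this is to generate a single sed script from catA.dat and run sed once. Assumptions: paths contain no '|' or newline characters, catA.sed and the scratch directory are hypothetical names, and /bin/Audio was added to catA.dat purely for illustration.

```shell
#!/bin/sh
# Sketch only: sample data from the first post, in a throwaway directory.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1

cat > all_files <<'EOT'
683706D9 104775 Sep 27 12:00:04 1999 /bin/Audio
4C799E06 100775 Nov 14 17:33:11 1997 /bin/Blkfsys
C851669A 104775 Oct 04 14:08:38 1996 /bin/Dev16
CA4B42E7 100775 Nov 21 11:58:06 1996 /bin/Dev16.ansi
FF4396D0 100775 Oct 04 14:06:03 1996 /bin/Dev16.par
EOT
printf '%s\n' /bin/Dev16.par /bin/Audio > catA.dat

# Build one sed command block per path:
#   1st expression: backslash-escape regex metacharacters such as "."
#   2nd expression: wrap the path in  \| path$|{ w ...; d }
# The leading space in the address anchors the match to the whole path field.
sed -e 's/[][\\.*^$]/\\&/g' \
    -e 's/.*/\\| &$|{\
w catA_list\
d\
}/' catA.dat > catA.sed

# One pass: matching lines go to catA_list, the rest to non_AFiles.
sed -f catA.sed all_files > non_AFiles
```

All the w commands in one sed run share the same output file, so every match accumulates in catA_list instead of being overwritten. The same generator could be repeated for catB.dat and catC.dat, and the generated scripts concatenated so that all categories are handled in a single pass.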