The UNIX and Linux Forums  
#1  10-04-2009  devlin (Registered User)
Sed: Delete lines in files that contain characters other than 'a-z', '0-9', '.' and '-'

Hello,

I'm looking for a shell command, or maybe a small PHP loop, to delete lines in files.txt (in the same directory) that contain any character other than 'a-z', '0-9', '.' and '-'.

Any line that has characters like éèÈ etc. should be deleted. I don't want to see the output (the files are large, about 5 MB each, and there are about 100 of them).

It's probably a combination of sed and a regex, but I can't find the right syntax for it.

Any help will be appreciated.

Thanks
#2  10-04-2009  scottn (Forum Advisor, VIP Member)
Hi.

You mentioned "files.txt" and "100 files". Can you be more specific about which file(s) the lines should be deleted from?

(assuming all files in directory...)
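bash Code:
# Keep only lines made up entirely of a-z, 0-9, '.' and '-' (\+ is a GNU sed extension)
for FILE in *; do
  [ -f "$FILE" ] || continue   # skip directories
  sed -n '/^[a-z0-9.-]\+$/ p' "$FILE" > "$FILE.tmp.$$"
  cp -f "$FILE.tmp.$$" "$FILE" && rm "$FILE.tmp.$$"
done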

Or
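bash Code:
# Same filter with grep: print only the lines that match the whitelist
for FILE in *; do
  [ -f "$FILE" ] || continue   # skip directories
  grep '^[a-z0-9.-]\+$' "$FILE" > "$FILE.tmp.$$"
  cp -f "$FILE.tmp.$$" "$FILE" && rm "$FILE.tmp.$$"
done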
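
Before running either loop on real files, it can help to try the pattern on a few sample lines. The following is only a minimal sketch: the function name keep_clean_lines and the sample data are made up for illustration, and LC_ALL=C is an added assumption so that [a-z] means just the 26 ASCII lowercase letters (in some locales the range can also match accented or uppercase letters).

```shell
# Hypothetical helper: print only the lines of stdin that consist
# entirely of a-z, 0-9, '.' and '-'.
keep_clean_lines() {
  # LC_ALL=C forces byte-wise matching, so the bytes of é never fall inside a-z
  LC_ALL=C grep -E '^[a-z0-9.-]+$'
}

# Made-up sample: the second and third lines contain characters outside the whitelist
printf '%s\n' 'example.com' 'café.fr' 'Upper.Case' 'my-site.org' | keep_clean_lines
```

With this sample only example.com and my-site.org are printed. grep -E is used here so that the plain + works without relying on the GNU-specific \+.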


Last edited by scottn; 10-04-2009 at 03:55 PM.. Reason: Always take a backup before something like this!
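
One possible way to build the backup into the filter itself is sketched below. The function name filter_file, the .bak suffix and the scratch data are illustrative choices, not something from the thread. The pattern uses the portable \{1,\} form, and LC_ALL=C is an assumption to keep [a-z] limited to ASCII.

```shell
# Hypothetical helper: back up one file, then keep only whitelisted lines in it.
filter_file() {
  file=$1
  tmp="$file.tmp.$$"
  cp -p "$file" "$file.bak" || return 1    # take the backup first; stop if it fails
  LC_ALL=C sed -n '/^[a-z0-9.-]\{1,\}$/p' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  cp -f "$tmp" "$file" && rm -f "$tmp"     # overwrite the original only after sed succeeded
}

# Example run in a scratch directory (made-up data)
dir=$(mktemp -d)
printf '%s\n' 'ok-line.1' 'bad line!' 'été' > "$dir/sample.txt"
filter_file "$dir/sample.txt"
cat "$dir/sample.txt"
```

After the run, sample.txt holds only ok-line.1, and sample.txt.bak still holds all three original lines.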
#3  10-04-2009  devlin (Registered User)
Thanks for your reply, scottn.

Your code gave me the biggest hint I've had in the last 3 days.

My files are in fact sitemaps, like sitemap.1.xml, sitemap.2.xml, sitemap.3.xml, ...
and I forgot to mention that I also need to allow '<', '>', ':' and '/'.

I tried the code below, but I think the ':' is not set correctly in this line...

(not working correctly)
Code:
sed -n "/^[<>\:a-z0-9.-\/]\{1,\}$/ p" sitemap.1.xml > sitemap.1.xml.tmp;mv sitemap.1.xml.tmp sitemap.1.xml
#4  10-04-2009  scottn (Forum Advisor, VIP Member)
Hi.

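Sed does seem to be somewhat pedantic about where bits go! Inside the brackets the '-' has to come last, otherwise it is read as a range between its neighbours.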

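bash Code:
# Whitelist now also allows '<', '>', '/' and ':'; the '-' stays last so it is literal
for FILE in sitemap.*.xml; do
  [ -f "$FILE" ] || continue   # skip if nothing matches the glob
  sed -n '/^[a-z<>/0-9.:-]\+$/ p' "$FILE" > "$FILE.tmp.$$"
  cp -f "$FILE.tmp.$$" "$FILE" && rm "$FILE.tmp.$$"
done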
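
To see the effect of the hyphen position on its own: in a bracket expression, a '-' between two characters defines a range, while a '-' placed last is a literal hyphen. The sketch below uses made-up input and hypothetical function names, and sets LC_ALL=C as an assumption so the ranges are plain ASCII.

```shell
# '-' between '.' and ':' is a range ('.' through ':'), so a literal '-' is NOT in the set
range_hyphen()   { LC_ALL=C sed -n '/^[a-z.-:]\{1,\}$/p'; }
# '-' at the end of the bracket expression is a literal hyphen
literal_hyphen() { LC_ALL=C sed -n '/^[a-z.:-]\{1,\}$/p'; }

printf '%s\n' 'my-site.org' | range_hyphen     # prints nothing
printf '%s\n' 'my-site.org' | literal_hyphen   # prints my-site.org
```

The same reading probably explains the earlier attempt, where the '-' sat in the middle of the brackets and so was not matched as a character of its own.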
#5  10-04-2009  devlin (Registered User)
Nice!!
Thanks.

I didn't know about the position thing!

Thanks for your great help.