Filtering data -extracting specific lines


 
# 1  
Old 05-16-2012

I have a table of data in which one of the columns contains strings of text.

Within that column, I want to keep some lines but not others.

For example, I want to include some combinations of the word "address", such as (address. | address? | the address | your address), but not (ip address | email address | address bar).

To include the things that I want, I have used
Code:
egrep 'address| your address'

But now I have problems with the things I don't want to include.
I have used
Code:
grep -v

but I don't know how to combine it so that it deletes the things I don't want.

Put simply: how do I delete lines that contain x or y or z?

I am also not sure whether I have to use grep, sed or awk.

cheers
# 2  
Old 05-16-2012
You could do it with any of the above, actually. They're multipurpose, though some lend themselves to certain tasks more easily than others. grep is relatively straightforward, so I'll use it here:

Code:
egrep "accept1|accept2|accept3" inputfile | egrep -v "reject1|reject2|reject3" > outputfile

egrep is 'extended grep', which supports things like | for multiple expressions in one regex.
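
As a rough illustration of the accept/reject pipeline above applied to the "address" case, here is a minimal sketch. The sample lines are made up, since the real table was never posted, and it uses grep -E, which is the POSIX spelling of egrep.

```shell
# Hypothetical sample data; the real input was not shown in the thread.
tmp=$(mktemp -d)
cat > "$tmp/inputfile" <<'EOF'
what is your address
my ip address changed
send it to the address on file
type it in the address bar
no match here
EOF

# First keep every line that mentions "address",
# then drop the lines containing any of the unwanted combinations.
grep -E 'address' "$tmp/inputfile" \
  | grep -E -v 'ip address|email address|address bar' > "$tmp/outputfile"

cat "$tmp/outputfile"
```

With this sample, only the "your address" and "the address on file" lines survive. The reject list does the real work here, so the accept pattern can stay as broad as plain "address".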
# 3  
Old 05-16-2012
How can I deal with punctuation such as [.,?!] in this case?
I realized it does not pick up the matches that end with punctuation marks.
# 4  
Old 05-16-2012
Hi.

Meta-solution:
Code:
To obtain the best answers quickly for processing datasets --
extracting, transforming, filtering, you should, after having
searched for answers (man pages, Google, etc.):

1. Post representative samples of your data (i.e.  data that
should "succeed" and data that should "fail")

2. Post what you expect the results to be, in addition to
describing them.  Be clear about how the results are to be
obtained, e.g.  "add field 2 from file1 to field 3 from file2",
"delete all lines that contain 'possum'", etc. 

3. Post what you have attempted to do so far.  Post scripts,
programs, etc.  within CODE tags.  If you have a specific
question about an error, please post the shortest example of the
code, script, etc. that exhibits the problem. 

4. Place the data and expected output within CODE tags, so that
they are more easily readable. 

5. If you require the use of a specific shell or command,
explain why that is the case: if you cannot solve a problem, it
may be because you do not know about or enough about a software
tool, in which case the responders are probably better judges of
a solution than you are. 

Special cases, exceptions, etc., are very important to include
in the samples.

Best wishes ... cheers, drl
# 5  
Old 05-16-2012
Escape them with a backslash, like \. \? \* \+
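
A minimal sketch of what that escaping could look like for the "address" patterns, assuming made-up sample lines (the original data was not posted). It also shows a bracket expression, where these characters are already literal and need no backslash.

```shell
# Hypothetical sample data for illustration only.
tmp=$(mktemp -d)
cat > "$tmp/inputfile" <<'EOF'
What is your address?
That is my address.
address book entry
EOF

# \. and \? match a literal dot and a literal question mark after "address".
grep -E 'address\.|address\?' "$tmp/inputfile" > "$tmp/escaped"

# Inside [...] the characters . , ? ! stand for themselves.
grep -E 'address[.,?!]' "$tmp/inputfile" > "$tmp/bracket"

cat "$tmp/escaped"
```

Both commands select the first two lines and skip "address book entry".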

You could also put them into a file and use -F to tell grep they're fixed strings:

Code:
$ cat acceptfile
a
b
c

$ cat rejectfile
d
e
f

$ grep -F -f acceptfile inputfile | grep -v -F -f rejectfile > outputfile

Be sure that acceptfile and rejectfile don't contain any blank lines. Blank lines will accept everything, or reject everything, respectively.
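
To see the blank-line effect for yourself, here is a small sketch with hypothetical file contents. The rejectfile.clean name is only for illustration.

```shell
tmp=$(mktemp -d)
# Hypothetical reject list with a blank second line.
printf 'ip address\n\naddress bar\n' > "$tmp/rejectfile"
printf 'your address\nip address\n' > "$tmp/inputfile"

# The blank line is an empty pattern that matches every line,
# so with -v every input line is rejected.
with_blank=$(grep -v -F -f "$tmp/rejectfile" "$tmp/inputfile" | wc -l)

# Strip blank lines from the pattern file, then filter again.
grep -v '^$' "$tmp/rejectfile" > "$tmp/rejectfile.clean"
without_blank=$(grep -v -F -f "$tmp/rejectfile.clean" "$tmp/inputfile")

echo "lines kept with blank pattern: $with_blank"
echo "lines kept after cleaning: $without_blank"
```

With the blank line in place nothing gets through. After cleaning, only "your address" is kept.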
# 6  
Old 05-18-2012
I used this two days ago and it worked amazingly...

Now that I tried to do the same thing again, it only shows the rows which contain the last word in the acceptfile. It's as if it's overriding everything else. And if there is something which does not exist in my list, it all comes back blank and deletes everything else.

Do you have any idea what the problem is?
# 7  
Old 05-18-2012
Can you show your exact script, and a sample of your input and output -- specifically, some lines which should be accepted but aren't?

And of course, check for blank lines in your accept/reject files, that's caught me a few times.
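
One way to run that check is sketched below. It also looks for carriage returns left over from Windows-style line endings. That part is an assumption: the thread never confirms the cause, but a pattern file where only the last line matches is a common symptom of it. The file contents are hypothetical.

```shell
tmp=$(mktemp -d)
# Hypothetical accept file: Windows line endings on the first two lines,
# one of which is otherwise blank; the last line ends with a plain newline.
printf 'your address\r\n\r\nthe address\n' > "$tmp/acceptfile"

# Show the line numbers of truly empty lines (prints nothing if there are none).
grep -n '^$' "$tmp/acceptfile"

# Count lines that carry a carriage return.
cr=$(printf '\r')
cr_lines=$(grep -c "$cr" "$tmp/acceptfile")
echo "lines with carriage returns: $cr_lines"

# Remove carriage returns first, then drop the lines that are now blank.
tr -d '\r' < "$tmp/acceptfile" | grep -v '^$' > "$tmp/acceptfile.clean"
```

If the count is non-zero, the cleaned copy is the one to pass to grep -F -f.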