grep only from a range of columns

04-30-2012

Hello all,

I have a .csv file with over 100 columns. I would like to grep for a pattern while searching only within a range of those fields, and print the entire line. For example: search for a pattern in columns $47-$87 only, but print the whole line ($0, all 100+ fields).

Thanks!

torchij


04-30-2012


grep doesn't understand columns. awk does, however.

Is it an actual csv, as in, comma-separated?

Code:

awk -F"," '{ M=0; for(N=47; (!M) && (N<=87); N++) if($N ~ /regex/) M++; } M' filename

Corona688


04-30-2012


Thank you for such a quick reply. Yes, it is comma-separated. Will this print every column? And I'm assuming 'regex' is where I will put my desired pattern?

torchij


04-30-2012


It will print every column, and /regex/ is where you put your pattern, yes. The // are what you use instead of quotes, sort of like in sed.

Corona688
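
As a quick check of the behaviour described above, here is a minimal sketch on a made-up 5-column sample (the data and the narrowed range 3-4 are assumptions for illustration, not the poster's file). It uses the same loop, so only matches inside the chosen columns count, and the whole line is printed when one is found.

```shell
# Hypothetical 5-column sample; only columns 3-4 are searched for "hom".
sample=$(mktemp)
cat > "$sample" <<'EOF'
p1,a,-,hom,-
p2,b,-,-,-
p3,c,hom,-,-
hom,d,-,-,-
p5,e,-,-,hom
EOF

# Same loop as in the post, with the range narrowed to fields 3..4.
# M is reset per line, set when a field in range matches, and the
# trailing "M" pattern prints the full line ($0) when M is non-zero.
result=$(awk -F"," '{ M=0; for(N=3; (!M) && (N<=4); N++) if($N ~ /hom/) M++; } M' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Lines 1 and 3 are printed in full; the lines with "hom" only in column 1 or column 5 are skipped.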


04-30-2012


This command is working, but it seems to stop searching after line 34. My file has about 32,000 rows, and only 34 were printed. I know there will be at least 5000 rows containing my pattern.

torchij


04-30-2012


Please post enough of your data to show the problem, and please show the exact program you ran, including the regex.

Corona688


04-30-2012


Unfortunately I can't really show the data; it contains patient information. To be exact, I have 83 columns/fields and 32,224 rows/lines of data, comma-separated. I'm interested in the regex /hom/, but only as it appears in fields 43-83.

For example, a few lines will look like this:

Code:


col    1 - 42     col 43      to       83
patient data      - - - - - hom - - - - -
patient data      - - - - - - - - - - - -  
patient data      - - hom - - - - - - - -
hom ....data      - - - - - - - - - - - -

I don't want to return the last line, which contains a "hom" only in columns 1-42.

The code I used:

Code:


awk -F"," '{ M=0; for(N=43; (!M) && (N<=83); N++) if($N ~ /hom/) M++; } M' input.csv >hom.csv

This code works, but my output file hom.csv only has 34 lines returned. By visual inspection, this is only the first 34 instances of "hom" in columns 43-83. I know by looking at my file that in line 23743, there is a "hom".

Any reason why this may be happening? Or will you have to see the entire file?

Thanks

torchij
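
The thread does not establish why only 34 lines came back, so the following is only a sketch of two checks that could be run without sharing the data. It assumes two possible causes that are guesses, not confirmed: quoted fields containing commas (which shift the column numbers that a plain -F"," split sees) and differences in letter case. The 3-column sample and the names `want`, `bad_lines`, `exact` and `anycase` are hypothetical.

```shell
# Hypothetical 3-column sample standing in for the real 83-column file.
sample=$(mktemp)
printf '%s\n' 'p1,-,hom' '"Smith, J",-,hom' 'p3,-,HOM' > "$sample"

# 1. Count lines whose field count differs from the expected number.
#    The quoted comma on line 2 yields an extra field and shifts the columns.
bad_lines=$(awk -F"," -v want=3 'NF != want { n++ } END { print n+0 }' "$sample")

# 2. Compare case-sensitive and case-insensitive matches in field 3.
exact=$(awk -F"," '$3 ~ /hom/ { n++ } END { print n+0 }' "$sample")
anycase=$(awk -F"," 'tolower($3) ~ /hom/ { n++ } END { print n+0 }' "$sample")

echo "lines with unexpected field count: $bad_lines"
echo "matches in field 3: exact=$exact, any case=$anycase"
rm -f "$sample"
```

On the real file, `want` would be 83 and the comparison would loop over fields 43-83. A non-zero count from the first check would suggest that the fields are not where the loop expects them on those lines.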
