Grep a pattern in a key position in a file


 
# 1  
Old 05-21-2018

Hi guys,

I have a file containing marathon data. Here's an example of what it looks like, including the given key:

Code:
# key: sex, time, athlete, athlete's nationality, date, city, country
M, 2:30:57.6, Harry Payne, GBR, 1929-07-05, Stamford Bridge, England
M, 2:5:42, Khalid Khannouchi, MAR, 1999-10-24, Chicago, USA
M, 2:5:37.8, Khalid Khannouchi, USA, 2002-04-14, London, USA

My task is to extract all lines where the runner's last name starts with a C.
Here is what I did: I sorted the file on key 4 and redirected the output to a new file, which gave me the list in alphabetical order by last name. I used this command:
Code:
sort -k 4 marathon > t21a


Does anyone know of a command that lets me grep all entries whose last name starts with C, matching only at the key position where the last names are?
Any help would be much appreciated.

I have tried to grep from a key position, but the option I used (from sort) does not exist for grep:
Code:
grep -k 4 '^[C,c]' t21b > t21c

Thank you!

Last edited by Scrutinizer; 05-21-2018 at 08:59 AM.. Reason: code tags
# 2  
Old 05-21-2018
Hi,

awk can do that. Try:
Code:
awk '$4~/^[Cc]/' t21b

The field separator here is comma-space so I would be inclined to use that:
Code:
awk -F', *' '$3~/ [cC]/' t21b

Note that these approaches only work if every athlete has exactly two names.
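To see why the number of names matters, it can help to print what each splitting rule actually puts in the tested field. The following is only a small sketch using two rows from this thread's sample data; the variable names rows, by_space and by_comma are made up for the illustration.

```shell
# Two rows taken from the sample data posted in this thread.
rows='M, 2:30:57.6, Harry Payne, GBR, 1929-07-05, Stamford Bridge, England
M, 2:4:48, Patrick Makau Musyoki, KEN, 2010-04-11, Rotterdam, Netherlands'

# Default whitespace splitting: field 4 is simply the fourth word on the line,
# which is the last name only when the athlete has exactly two names.
by_space=$(printf '%s\n' "$rows" | awk '{print $4}')

# Comma-space splitting: field 3 is the complete athlete name.
by_comma=$(printf '%s\n' "$rows" | awk -F', *' '{print $3}')

printf '%s\n' "$by_space" "$by_comma"
```

For the three-word name, field 4 under whitespace splitting is the middle name rather than the last name, so a test on $4 would look at the wrong word.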

The following approach instead tests the last word of field 3, so it works for any number of names:
Code:
awk -F', *' '{n=split($3,F," ")} F[n]~/^[cC]/' t21b
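As a quick check of the last-word approach, here is a minimal sketch run against three rows. Only the Khannouchi row comes from the thread; the other two rows (and the XXX, Anytown and Nowhere placeholders) are invented purely to exercise the pattern, and the variable names are arbitrary.

```shell
# First row is from the thread; the other two are invented for illustration.
rows='M, 2:5:42, Khalid Khannouchi, MAR, 1999-10-24, Chicago, USA
F, 2:40:00, Carla Jones, XXX, 2000-01-01, Anytown, Nowhere
M, 2:30:00, Juan Pablo Cortez, XXX, 2000-01-02, Anytown, Nowhere'

# Split the athlete field on spaces and test only its last word, F[n].
matches=$(printf '%s\n' "$rows" | awk -F', *' '{n=split($3,F," ")} F[n]~/^[cC]/')

printf '%s\n' "$matches"
```

Only the Cortez row should be printed: Carla Jones has a first name starting with C but a last name that does not, and the three-word name is handled because the test looks at the last word rather than a fixed position.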

# 3  
Old 05-21-2018
I'll try it out right now. Thank you!

---------- Post updated at 12:16 PM ---------- Previous update was at 12:12 PM ----------

Heyyy it worked, thank you so much!

---------- Post updated at 01:56 PM ---------- Previous update was at 12:16 PM ----------

Hey,
So I'm having issues sorting a data set.
The data set contains entries such as:

Code:
# key: sex, time, athlete, athlete's nationality, date, city, country

M, 2:30:57.6, Harry Payne, GBR, 1929-07-05, Stamford Bridge, England
M, 2:5:42, Khalid Khannouchi, MAR, 1999-10-24, Chicago, USA
M, 2:5:37.8, Khalid Khannouchi, USA, 2002-04-14, London, UK
M, 2:4:48, Patrick Makau Musyoki, KEN, 2010-04-11, Rotterdam, Netherlands

I now want to sort this file by athlete name, not alphabetically but by the number of words in each athlete's name. For instance, I would like the output to start with all athletes who have 2-word names, followed by all athletes with 3-word names, and so on. Is this possible using sort?

I have tried various commands:


Code:
sort -k 4-5|4-6 t22 > t22a

sort -k 4,5 | sort -k 4,6 t22 > t22a

sort -kb 4 t22 > t22a

sort -k 4 -b t22 > t22a

But I haven't managed to sort the data in the way I want it.
Does anyone know what I'm doing wrong?
Thanks in advance for your help.
# 4  
Old 05-21-2018
Try adding an extra first column that contains the number of words in column 3, sorting on that column, and then removing it again. For example:
Code:
awk -F', *' 'NR>1 && NF{print split($3,F," "), $0}' OFS='\t' t22 | sort -k1,1n | cut -f2-

NR>1 && NF skips the header and the empty line, and split($3,F," ") returns the number of words in column 3.
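The command above leaves the order of rows with the same word count up to sort. If the original input order should be kept within each group, one possible variant (a sketch, not part of the answer above) is to add the input line number NR as a second helper column and sort on both. The file name t22.sample and the variable names are assumptions made for this example; the rows are from the thread, reordered so the effect is visible.

```shell
# Sample file: header, empty line, then three rows from the thread.
cat > t22.sample <<'EOF'
# key: sex, time, athlete, athlete's nationality, date, city, country

M, 2:4:48, Patrick Makau Musyoki, KEN, 2010-04-11, Rotterdam, Netherlands
M, 2:30:57.6, Harry Payne, GBR, 1929-07-05, Stamford Bridge, England
M, 2:5:42, Khalid Khannouchi, MAR, 1999-10-24, Chicago, USA
EOF

tab=$(printf '\t')   # literal tab, used as the delimiter for sort

# Decorate each row with (word count, line number), sort numerically on both,
# then cut the two helper columns off again.
sorted=$(awk -F', *' 'NR>1 && NF{print split($3,F," "), NR, $0}' OFS='\t' t22.sample |
  sort -t "$tab" -k1,1n -k2,2n |
  cut -f3-)

printf '%s\n' "$sorted"
```

With this input, the two 2-word names (Payne, then Khannouchi) come out first in their original relative order, followed by the 3-word name.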

Last edited by Scrutinizer; 05-21-2018 at 05:08 PM..