UNIX for Dummies Questions & Answers
#1
Hi All,

I am looking for a simple, concise solution, most likely using sed, to process records that each span the following 4 rows, keeping a record only if its second row satisfies certain criteria, such as the surname matching Smith or Jackson:

Code:
John (firstname)
Smith (surname)
20/05/1984 (dob)
Male (gender)

It would have been possible to use awk if the data were on the same line with a fixed delimiter. Writing many lines of shell script would not be a problem, but I am hoping for a brief solution in sed and am not familiar with how it could be done. I am running on the Solaris 10 x86 platform.

Your assistance would be much appreciated,
George

Last edited by gjackson123; 06-18-2012 at 09:36 AM. Reason: Tidy up code & provide platform detail
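One way to get the "same line with a fixed delimiter" layout mentioned above is to let `paste` join every four lines with tabs before handing them to awk. This is only a sketch: the function name `select_by_surname` and the second sample record are made up for illustration, and it assumes every record has exactly four lines and that the data contains no tab characters.

```shell
# Hypothetical helper: reads 4-line records on stdin, prints the matching ones.
select_by_surname() {
    # paste joins each group of four lines into one tab-separated line,
    # awk keeps lines whose 2nd field is Smith or Jackson (optionally
    # followed by a space and a label), and tr restores the newlines.
    paste - - - - |
        awk -F'\t' '$2 ~ /^(Smith|Jackson)( |$)/' |
        tr '\t' '\n'
}

# Made-up sample input: the first record is from the post, the second is invented.
selected=$(printf '%s\n' \
    'John (firstname)' 'Smith (surname)' '20/05/1984 (dob)' 'Male (gender)' \
    'Jane (firstname)' 'Brown (surname)' '01/02/1990 (dob)' 'Female (gender)' |
    select_by_surname)

printf '%s\n' "$selected"
```

With a real file the call would be `select_by_surname < file1`.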
#2
Can you please provide a sample input file and the intended output file? I am a bit confused about what you consider a record in your file. Based on my understanding (that a record is always the combination of the above 4 rows, and that the second row of a set should begin with 'Smith' to be selected), here is my solution:

Code:
sed 's/(gender)/&*/g' file1 | awk -F'\n' '$2 ~ /^Smith.*/ {print}' RS='*'

Note: This solution assumes that '*' does not appear anywhere in your data. If it does, replace it with another character that does not occur in your data.

Last edited by jawsnnn; 06-18-2012 at 09:50 AM.
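A detail worth checking in the pipeline above: the `*` is appended at the end of the `(gender)` line, so the newline that follows it becomes the first character of the next record, and from the second record onward awk sees an empty `$1` with the surname in `$3`. The sketch below is one possible adjustment, not the poster's own code: it strips that leading newline before testing `$2`. The function name `filter_smith` and the second and third sample records are made up for illustration.

```shell
# Hypothetical helper: marks record ends with "*", then filters on the 2nd line.
filter_smith() {
    sed 's/(gender)/&*/' |
    awk -F'\n' -v RS='*' '
        { sub(/^\n/, "") }      # drop the newline left over after the previous "*"
        $2 ~ /^Smith/           # print records whose second line starts with Smith
    '
}

# Made-up sample input: three labelled 4-line records.
matches=$(printf '%s\n' \
    'John (firstname)' 'Smith (surname)' '20/05/1984 (dob)' 'Male (gender)' \
    'Jane (firstname)' 'Brown (surname)' '01/02/1990 (dob)' 'Female (gender)' \
    'Mary (firstname)' 'Smith (surname)' '03/04/1975 (dob)' 'Female (gender)' |
    filter_smith)

printf '%s\n' "$matches"
```

Modifying `$0` with `sub` makes awk split the record into fields again, which is why `$2` refers to the surname line afterwards.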
#3
Turning to SED to select specific records
Hi jawsnnn,

Thanks for your valuable input. There should be no need for a sample input file, since your understanding of its composition, as shown in my initial post, is correct. Nevertheless, could you provide a brief explanation of how your code works, since my sed knowledge is limited? Also, which of the following minor updates would accommodate more than one surname?

Code:
sed 's/(gender)/&*/g' file1 | awk -F'\n' '$2 ~ /^Smith.*|^Jone.*|^Green.*/ {print}' RS='*'
or
sed 's/(gender)/&*/g' file1 | awk -F'\n' '$2 ~ /^(Smith|Jone|Green).*/ {print}' RS='*'
or
sed 's/(gender)/&*/g' file1 | awk -F'\n' '$2 ~ /^(?:Smith|Jone|Green).*/ {print}' RS='*'

I will test each of these statements to see which one works and let you know.

Thanks again,
George
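As a quick way to compare such patterns without the full pipeline, the surname test can be tried on bare lines. awk uses POSIX extended regular expressions, in which both the ungrouped form and the `(Smith|Jone|Green)` form are valid, while `(?:...)` is Perl syntax that POSIX does not define, so the third variant is best avoided in awk. The sketch below uses made-up names and drops the trailing `.*`, which does not change what matches.

```shell
# Made-up test names, one per line.
names='Smith
Jones
Greene
Brown'

# Ungrouped alternation, as in the first variant
first=$(printf '%s\n' "$names" | awk '/^Smith|^Jone|^Green/')

# Grouped alternation, as in the second variant
second=$(printf '%s\n' "$names" | awk '/^(Smith|Jone|Green)/')

printf 'first variant:\n%s\nsecond variant:\n%s\n' "$first" "$second"
```

Both forms select Smith, Jones and Greene here. Note that these are prefix matches, so a surname such as Greene is accepted as well as Green.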
#4
I think the first variation should work fine for multiple surnames. Let me explain the solution:

Code:
sed 's/(gender)/&*/g' file1 | awk -F'\n' '$2 ~ /^Smith.*/ {print}' RS='*'

1. I append an asterisk '*' to the string (gender), i.e. the end of your record, using sed:

Code:
sed 's/(gender)/&*/g'

Here & is replaced by the matched string.

2. Then I divide the output of this command into records separated by '*', with fields separated by '\n', the newline character. This lets me treat the four lines in each set as four different fields in the awk command. I achieve this with two settings:

Code:
RS='*' and -F'\n'

3. Then I simply match the second field (i.e. the second row of each set) against the pattern

Code:
^Smith.*

which matches fields starting with the string Smith followed by any characters. In retrospect, the .* in the pattern is probably not needed.
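One way to check the record and field splitting described in step 2 is to print every field with its record and field number; any stray empty field, for instance from the newline that follows each `*`, then becomes visible. This is a sketch only: the function name `dump_fields` and the second sample record are made up for illustration.

```shell
# Hypothetical helper: shows how awk splits the sed output into records and fields.
dump_fields() {
    sed 's/(gender)/&*/' |
    awk -F'\n' -v RS='*' '{
        for (i = 1; i <= NF; i++)
            printf "record %d, field %d: [%s]\n", NR, i, $i
    }'
}

# Made-up sample input: two labelled 4-line records.
out=$(printf '%s\n' \
    'John (firstname)' 'Smith (surname)' '20/05/1984 (dob)' 'Male (gender)' \
    'Mary (firstname)' 'Jones (surname)' '03/04/1975 (dob)' 'Female (gender)' |
    dump_fields)

printf '%s\n' "$out"
```

In the first record the surname is field 2, but in the second record field 1 is empty and the surname shows up as field 3, which is worth keeping in mind when testing `$2`.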
#5
Hi, gjackson123.
Meta-advice. If one wants more than one suggested solution, one should supply sample data. That allows consistency among results. Otherwise, you put an additional burden on the responders to come up with sample data, which, in addition to likely differing from one responder to another, may not be representative of the real set.

In general, if faced with the task of creating sample data in addition to a solution, I will probably move on to other questions without attempting to solve the problem.

Best wishes ... cheers, drl

Last edited by drl; 06-22-2012 at 10:51 AM.
#6
Hi jawsnnn & drl,

Below is the employee.txt as requested:

Code:
$ more employee.txt
John
Barry
21/04/1988
Male
Jessica
Smith
16/09/2000
Female
Joyce
Brown
05/12/1985
Female
Kyle
Jones
02/10/1945
Male

Code:
$ sed 's/(gender)/&*/g' employee.txt | more
John
Barry
21/04/1988
Male
Jessica
Smith
16/09/2000
Female
Joyce
Brown
05/12/1985
Female
Kyle
Jones
02/10/1945
Male

It doesn’t look like the sed statement is doing anything with it. Should the (gender) be replaced with something else? What should I expect the data to look like coming out of sed and going into awk, which I am more comfortable with? I am interested in getting a solution with everyone’s help.

Thanks again,
George

---------- Post updated 06-22-12 at 12:19 AM ---------- Previous update was 06-21-12 at 06:22 PM ----------

Hi,

Below are some more attempts to figure out how your sed & awk statements work:

Code:
$ uname -a
SunOS startrek 5.10 Generic_141444-09 sun4v sparc SUNW,SPARC-Enterprise-T5220

Code:
$ more employee.txt
John
Barry
21/04/1988
Male
Jessica
Smith
16/09/2000
Female
Joyce
Brown
05/12/1985
Female
Kyle
Jones
02/10/1945
Male

## Returned the same list & order
Code:
$ sed 's/(gender)/&*/g' employee.txt
John
Barry
21/04/1988
Male
Jessica
Smith
16/09/2000
Female
Joyce
Brown
05/12/1985
Female
Kyle
Jones
02/10/1945
Male

## Returned the same list & order
Code:
$ sed 's/(Male)/&*/g' employee.txt
John
Barry
21/04/1988
Male
Jessica
Smith
16/09/2000
Female
Joyce
Brown
05/12/1985
Female
Kyle
Jones
02/10/1945
Male

## Returned the same list & order
Code:
$ sed 's/(Female)/&*/g' employee.txt
John
Barry
21/04/1988
Male
Jessica
Smith
16/09/2000
Female
Joyce
Brown
05/12/1985
Female
Kyle
Jones
02/10/1945
Male

## Awk is not getting the right output from SED
Code:
$ sed 's/(Male)/&*/g' employee.txt | awk -F'\n' '$2 ~ /^Smith.*/ { print }' RS='*'
$

## Same input to AWK as from SED
Code:
$ awk -F'\n' '$2 ~ /^Smith.*/ { print }' RS='*' employee.txt
$

I suspect the problem is from

Code:
sed 's/(gender)/&*/g'

but I am still trying to wrap my head around it. Also, what is the purpose of the round brackets () around gender, and of the & and *? The sed statement appears to be doing a global replacement of (gender) with &*, though I am not clear whether gender should be replaced with something else.

Thanks a lot,
George

Last edited by gjackson123; 06-22-2012 at 01:27 AM. Reason: Cleaned out spurious formatting
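On the question of what `(gender)`, `&` and `*` do: in sed's default (basic) regular expressions the round brackets are ordinary characters, so `(gender)` only matches that literal text, brackets included; `&` in the replacement stands for whatever text matched; and `*` in the replacement is a plain asterisk. Since employee.txt contains no `(gender)`, `(Male)` or `(Female)` text, each of the sed commands above passes the file through unchanged. A small sketch with two made-up input lines illustrates the difference:

```shell
# Line containing the literal text "(gender)": the match is kept (&) and "*" is appended
with_label=$(printf '%s\n' 'Male (gender)' | sed 's/(gender)/&*/g')

# Line without that literal text: nothing matches, so it passes through unchanged
without_label=$(printf '%s\n' 'Male' | sed 's/(gender)/&*/g')

printf '%s\n' "$with_label" "$without_label"
```

The first line comes out as `Male (gender)*` and the second stays `Male`, so with unlabelled data awk never sees a `*` to split records on.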
#7
Perhaps this is your requirement:

Code:
$ cat input
John
Barry
21/04/1988
Male
Jessica
Smith
16/09/2000
Female
Joyce
Brown
05/12/1985
Female
Kyle
Jones
02/10/1945
Male
$ awk 'BEGIN { RS="Male|Female" } { print $1,$2,$3 } ' input
John Barry 21/04/1988
Jessica Smith 16/09/2000
Joyce Brown 05/12/1985
Kyle Jones 02/10/1945
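The `RS="Male|Female"` form relies on awk treating a multi-character RS as a regular expression, which gawk does but POSIX leaves unspecified. A sketch that avoids RS altogether, and also applies the surname filter from the first post, is to buffer the lines in groups of four. It assumes every record is exactly four lines; the function name `select_records` and the choice of Smith and Jones as wanted surnames are made up for illustration.

```shell
# Hypothetical helper: reads 4-line records on stdin, prints matching ones on one line.
select_records() {
    awk '
        { line[NR % 4] = $0 }       # slots 1, 2, 3, then 0 for the 4th line
        NR % 4 == 0 && line[2] ~ /^(Smith|Jones)$/ {
            print line[1], line[2], line[3], line[0]
        }
    '
}

# Same data as employee.txt in the thread, one value per line.
result=$(printf '%s\n' \
    John Barry 21/04/1988 Male \
    Jessica Smith 16/09/2000 Female \
    Joyce Brown 05/12/1985 Female \
    Kyle Jones 02/10/1945 Male |
    select_records)

printf '%s\n' "$result"
```

With the real file the call would be `select_records < employee.txt`, which should keep only the Smith and Jones records.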