Hi.
Quote:
Originally Posted by jawsnnn
I was under the impression that the string (gender) was part of your sample file. ... PS: If your record is always comprised of 4 rows, Scrutinizer's solution is better since it runs only one process.
Observation:
That impression is why it is so important for questions to include a representative sample of data.
I prefer modular solutions because I can often reuse the parts in other solutions. Here is a script that packages up 4 lines into a "super line", searches line 2 (field 2 in the super line), and then unpacks. It uses "@" as the stand-in for the embedded newlines. There are two parts: one where the awk is coded directly, and one done with shell functions (feasible because of how concise awk can be -- feasible, but the shell-function version is not very readable):
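As a concrete illustration of the bundling idea, a directly coded pipeline might look like the following. The four-line records and the "Female" search value are hypothetical, since the thread's sample data is not shown here:

```shell
# Pack groups of 4 lines into one "super line" with "@" standing in
# for the embedded newlines, select super lines whose second field
# is "Female", then unpack. The sample data below is hypothetical.
cat > data.txt <<'EOF'
Jessica
Female
16/09/2000
Smith
John
Male
01/01/1999
Doe
EOF

awk '{ printf "%s%s", $0, (NR % 4 ? "@" : "\n") }' data.txt |
awk -F'@' '$2 == "Female"' |
tr '@' '\n'
```

Each stage of the pipeline can be followed by a tee into a file to inspect the intermediate results.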
producing:
Intermediate results can be seen from the files that tee produces.
The bundling idea came from https://www.unix.com/shell-programmin...ting-code.html although there are likely other sources as well. The shell search function is more for illustration. I think a directly coded awk script in the pipeline "sandwich" would be preferable.
I think that a general bundle utility would be very useful: two modes, one collecting lines as done here, another with strings (tokens) as done with xargs in the post cited.
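A hedged sketch of what such a bundle utility could look like as a pair of shell functions; the names bundle/unbundle and their argument order are my invention, not an existing tool:

```shell
# bundle N M : join every N input lines using marker character M
# unbundle M : reverse the operation
# These names and interfaces are hypothetical.
bundle()   { awk -v n="$1" -v m="$2" '{ printf "%s%s", $0, (NR % n ? m : "\n") }'; }
unbundle() { tr "$1" '\n'; }

printf 'a\nb\nc\nd\n' | bundle 2 '@'
```

The token-collecting mode (as xargs does in the cited post) would be a natural second entry point.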
Your suggestion is the closest to what I am looking for so far. Let's have a look at how I have applied it to the same example:
This is great, so there is no need to use sed at all. However, is it possible to include the record separator as part of the record as well? i.e. Jessica Smith 16/09/2000 Female.
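One portable way to keep the separator line as part of the record is to accumulate lines until the gender line closes the record, instead of consuming it as RS. This is a sketch against hypothetical four-line records, since the thread's sample is not shown here:

```shell
# Accumulate lines into one record; the gender line both terminates
# the record and is included in it. Data is hypothetical.
printf 'Jessica\nSmith\n16/09/2000\nFemale\nJohn\nDoe\n01/01/1999\nMale\n' |
awk '{ rec = rec (rec ? " " : "") $0 }
     /^(Female|Male)$/ { print rec; rec = "" }'
```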
Just to make the sample data a little closer to the actual data, which I am not able to share for confidentiality reasons: the actual data also includes the pipe "|" character as part of the record separator. For instance:
As a result, is it still possible to identify the record using RS=" L\|Employee.gender\|Male| L\|Employee.gender\|Female", or is there some way to search the second field ($2) of the record separator instead of the whole line, which is made up of pipe-separated fields?
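If the lines really are pipe-delimited, matching on a split field avoids a regex-valued RS entirely. This is a sketch with a hypothetical layout guessed from the separator string quoted above, not the actual confidential data:

```shell
# Hypothetical pipe-delimited lines; select on fields rather than
# matching the whole line. The layout is assumed, not from the thread.
printf 'L|Employee.name|Jessica\nL|Employee.gender|Female\nL|Employee.name|John\nL|Employee.gender|Male\n' |
awk -F'|' '$2 == "Employee.gender" && $3 == "Female"'
```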
Apologies for coming up with a new record format, which may annoy some moderators, but we are very close to wrapping up this thread.
So I take it you have gawk installed on your Solaris box, and that is what you are using, because this will absolutely not work with the standard awks on Solaris....
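The portability issue is that a multi-character or regex-valued RS is a gawk extension; POSIX awk uses only the first character of RS. A small demonstration, which works in gawk, mawk, and busybox awk but not in Solaris /usr/bin/awk:

```shell
# With a regex-capable awk, RS="SEP\n" splits this input into two
# records; traditional awk would treat RS as just the character "S".
printf 'a\nSEP\nb\n' |
awk 'BEGIN { RS = "SEP\n" } END { print NR " records" }'
```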
Quote:
Originally Posted by gjackson123
[..]
Just to make the sample data a little more closer to the actual data which I am not able to review due to confidentiality reason which also include the pipe "|" character as part of the record separator.
[..]
Apologies for coming up with a new record format which may annoy some moderators but we are very close to wrapping up this thread.
[..]
Apologies, yet you still are not providing a representative sample.
Please do not leave people guessing. Show a representative (anonymized) sample of your input and desired output or this thread will need to be closed.
I have a file, named records.txt, containing a large number of records, around 0.5 million, in the format below:
28433005 1 1 3 2 2 2 2 2 2 2 2 2 2 2
28433004 0 2 3 2 2 2 2 2 2 1 2 2 2 2
...
Another file is a key file, named key.txt, which is the list of some numbers in the first column of... (5 Replies)
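Although the question is cut off, the usual shape of this task is a two-file awk lookup: load key.txt into an array, then keep records whose first column is present. A sketch with the filenames from the post and abbreviated data:

```shell
cat > key.txt <<'EOF'
28433004
EOF
cat > records.txt <<'EOF'
28433005 1 1 3 2 2 2 2 2 2 2 2 2 2 2
28433004 0 2 3 2 2 2 2 2 2 1 2 2 2 2
EOF
# First pass (NR == FNR) stores the keys; second pass prints
# records whose column 1 is a stored key.
awk 'NR == FNR { keys[$1]; next } $1 in keys' key.txt records.txt
```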
Hi All
I would like to modify a file like this:
>antax gioq21 tris notes
abcdefghij
klmnopqrs
>betax gion32 ter notes2
tuvzabcdef
ahgskslsoo
into this:
>tris
abcdefghij
klmnopqrs
>ter
tuvzabcdef
ahgskslsoo
So, I would like to remove the first two fields (and output field 3) in record... (4 Replies)
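A sketch of one way to do this: on header lines (those starting with ">") print only field 3 with the ">" re-attached, and pass the other lines through unchanged:

```shell
cat > seqs.txt <<'EOF'
>antax gioq21 tris notes
abcdefghij
klmnopqrs
>betax gion32 ter notes2
tuvzabcdef
ahgskslsoo
EOF
# Header lines keep only their third whitespace-separated field.
awk '/^>/ { print ">" $3; next } { print }' seqs.txt
```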
Friends,
I have data sorted on id like this
id addressl
1 abc
2 abc
2 abc
2 abc
3 aabc
4 abc
4 abc
I want to pick all ids with addresses, leaving out duplicate records. Desired output would be
id address
1 abc
2 abc
3 abc
4 abc (5 Replies)
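The classic idiom for this is `!seen[$1]++`, which prints only the first record for each value of field 1 (shown here against the sample above; it also keeps the header line, as the desired output does):

```shell
cat > addr.txt <<'EOF'
id address
1 abc
2 abc
2 abc
3 aabc
4 abc
4 abc
EOF
# seen[$1]++ is 0 (false) the first time an id appears, so the
# negation prints exactly one record per id.
awk '!seen[$1]++' addr.txt
```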
Print only records from file 2 that do not match file 1 based on criteria of comparing column 1 and column 6
I was trying to play around with the following code I found in other threads, but without much success.
Code:
awk 'NR==FNR{p=$1;$1=x;A=$0;next}{$2=$2(A?A:",,,")}1' FS=~ OFS=~ file1 FS="*"... (11 Replies)
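Without the full thread it is hard to adapt that one-liner, but the stated goal (records from file 2 whose column 1 / column 6 pair has no match in file 1) fits the standard anti-join pattern; the sample data here is invented:

```shell
cat > file1 <<'EOF'
a x x x x p
b x x x x q
EOF
cat > file2 <<'EOF'
a y y y y p
c y y y y r
EOF
# Store (col1, col6) pairs from file1, then print file2 records
# whose pair was never stored.
awk 'NR == FNR { seen[$1,$6]; next } !(($1,$6) in seen)' file1 file2
```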
Hello:
I am new to shell script programming. Now I would like to select a specific block of records from a file. For example, the current file "xyz.txt" contains 1 million records, and I want to select the block of records from line number 50000 to 100000 and save it into a file. Can anyone suggest how... (3 Replies)
i have a table
records
------------
id | user | time | event
91 | admin | 12:00 | hi
92 | admin | 11:00 | hi
93 | admin | 12:00 | bye
94 | admin | 13:00 | bye
95 | root | 12:00 | hi
96 | root | 12:30 | hi
97 | root | 12:56 | hi
how could I select and display only the user and event from... (6 Replies)
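One reading of the cut-off question is: print just the user and event columns. A sketch that splits on the pipes, with the field layout assumed from the rows above:

```shell
cat > events.txt <<'EOF'
91 | admin | 12:00 | hi
92 | admin | 11:00 | hi
95 | root | 12:00 | hi
EOF
# Treat a pipe with optional surrounding blanks as the field
# separator, then print fields 2 (user) and 4 (event).
awk -F' *[|] *' '{ print $2, $4 }' events.txt
```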
Hi everyone.
I am a newbie to Linux. I have a problem which I couldn't solve alone. I have a text file with records separated by empty lines like this:
ID: 20
Name: X
Age: 19

ID: 21
Name: Z

ID: 22
Email: xxx@yahoo.com
Name: Y
Age: 19
I want to grep records that... (4 Replies)
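awk's paragraph mode handles this directly: setting RS to the empty string makes each blank-line-separated block one record, so a match prints or counts whole records. The search term "Age: 19" is a guess at the truncated question:

```shell
cat > people.txt <<'EOF'
ID: 20
Name: X
Age: 19

ID: 21
Name: Z

ID: 22
Email: xxx@yahoo.com
Name: Y
Age: 19
EOF
# RS="" is paragraph mode; ORS re-inserts the blank line between
# the records that match.
awk -v RS= -v ORS='\n\n' '/Age: 19/' people.txt
```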
Dear list
It's my first post and I would like to greet everyone.
What I would like to do is select records 7 and 11 from each file in a folder, then run an executable inside the script for the selected parameters.
The file format is something like this
7 100 200
7 100 250
7 100 300 ... (1 Reply)
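One reading of this (the executable name ./myprog, the folder name, and the field layout are placeholders): filter each file for records whose first field is 7 or 11, then hand each surviving record's fields to the program:

```shell
mkdir -p datadir
printf '7 100 200\n9 100 250\n11 100 300\n' > datadir/f1
# Keep records whose first field is 7 or 11, then pass the fields
# to a (placeholder) executable one record at a time.
for f in datadir/*; do
    awk '$1 == 7 || $1 == 11' "$f" |
    while read -r a b c; do
        echo "would run: ./myprog $a $b $c"
    done
done
```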
As part of a bigger task, I had to read through a file and separate records into various batches based on a field. Specifically, separate records based on the value in the batch field as defined below. The batch field contains left-justified numbers.
The datafile is here
> cat infile
12345 1 John Smith ... (5 Replies)
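Since the record layout is cut off, assume the batch field is column 2; awk can then fan records out to one file per batch value:

```shell
cat > infile <<'EOF'
12345 1 John Smith
12346 2 Jane Roe
12347 1 Jim Poe
EOF
# Redirect each record to a file named after its batch field.
# (The parentheses around the filename expression matter in awk.)
awk '{ print > ("batch." $2) }' infile
```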
Hi All,
I need to select only those records having a non-zero value in the first column of a comma-delimited file.
Suppose my input file is having data like:
"0","01/08/2005 07:11:15",1,1,"Created",,"01/08/2005"
"0","01/08/2005 07:12:40",1,1,"Created",,"01/08/2005"... (2 Replies)
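Because the first column is quoted, the comparison has to include the quotes (or strip them first). A sketch, with the non-zero row invented since the sample shows only zeros, and FS="," assumed safe because no field contains an embedded comma:

```shell
cat > in.csv <<'EOF'
"0","01/08/2005 07:11:15",1,1,"Created",,"01/08/2005"
"7","01/08/2005 07:12:40",1,1,"Created",,"01/08/2005"
EOF
# Keep records whose quoted first field is not "0".
awk -F',' '$1 != "\"0\""' in.csv
```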