I want to search the example input file above, and when I find 4 sequential rows with values of 1,2,3,4 return those values and the two previous ones.
In this case it should return
I know this can be done on various platforms, but I'd like to use awk in this case. I'm fairly certain I'll end up using a six element array, but y'all will probably figure this out before I do. Thanks in advance, brain too old to figure this stuff out anymore...
---------- Post updated at 07:47 PM ---------- Previous update was at 04:47 PM ----------
I started down the path of using grep to pull out the rows that I need, 2 before the match and 3 after the match. I was going to simplify the match to finding only the first entry that I needed, and filter the extra ones out later. After that it was a simple matter of formatting. That is, until the case where we had overlapping matches, like so.
Say I'm looking for rows with 1,2,3,4 - then I was only going to grep on "1", and extract the leading and following rows. Even if I got a lot of entries that were not a perfect match, I could easily filter those out. Here is the case that ruined it.
The grep misbehaves because it refuses to print a given line more than once. In this case the "1" belongs to the "before" part of one selection and the "after" part of another, and grep only reports it once. So unless there is a way of telling grep not to do this, I can't use grep....
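The effect described above can be reproduced with a small made-up input (the values below are hypothetical, not the poster's file). This is only a sketch, and it assumes a grep that supports the `-B`/`-A` context options, as GNU and BSD grep do:

```shell
# Hypothetical sample: one value per line, with two "1" rows close together.
sample() { printf '%s\n' 9 8 1 7 1 2 3 4 5; }

# Match whole lines equal to "1", with 2 lines before and 3 lines after.
# Where the context of two matches overlaps, grep merges the ranges and
# prints every input line only once.
context_of_ones() { grep -x -B2 -A3 '1'; }

sample | context_of_ones
```

Both matches end up in a single merged block (lines 1 through 8 of the sample), so the output can no longer be cut into fixed six-line groups.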
Moderator's Comments:
Please wrap all code, files, input & output/errors in CODE tags.
It makes it far easier to read and preserves multiple spaces for indenting or fixed-width data.
Last edited by rbatte1; 09-23-2016 at 08:30 AM..
Reason: Added CODE tags
I can sort of follow this. Is it hardcoded to use "1,2,3,4" for the search criteria? Or at least 4 sequential numbers?
I need to have a little flexibility in selecting the 4 values to search for (I used 1,2,3,4 just as an oversimplified example).
I have confirmed that it works great for 1,2,3,4.......
Thanks for the first response!
Last edited by cedenker; 09-23-2016 at 12:35 AM..
Reason: clarify my follow up question
If you are looking for different strings (not "1" thru "4") a slightly different solution is required:
This version now searches for "one", "two", "three" and then "four", and can easily be converted to search for your list of specific strings. The split function builds an array M[] which is used to match each line.
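A minimal sketch of that idea, not necessarily the code as originally posted: it keeps the last six lines in a buffer and checks whether the most recent four equal M[1..4]. The function name `find_seq`, the comma-separated argument, and the `before` variable are assumptions made for illustration.

```shell
# find_seq LIST: read lines on stdin; whenever the last n lines equal the
# n comma-separated values in LIST, print them plus the 2 lines before.
find_seq() {
    awk -v list="$1" -v before=2 '
    BEGIN { n = split(list, M, ",") }          # M[1..n] holds the values to find
    {
        buf[NR] = $0                           # remember the current line
        delete buf[NR - n - before]            # keep only the last n+before lines
        if (NR < n) next                       # not enough lines read yet
        for (i = 1; i <= n; i++)               # compare as strings, line by line
            if ((buf[NR - n + i] "") != M[i]) next
        start = NR - n - before + 1            # first line to print
        if (start < 1) start = 1               # match near the top of the file
        for (i = start; i <= NR; i++) print buf[i]
    }'
}

# Usage with made-up input:
printf '%s\n' a b one two three four c | find_seq one,two,three,four
```

Because every line is checked against the whole buffer, an input such as one, one, two, three, four still matches on the second "one". Passing `1,2,3,4` instead of the words gives the numeric case from the first post.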
Initial test works fine. Let me add some of the other things I oversimplified into the script and see if I can break it. Thanks!
---------- Post updated 09-23-16 at 12:16 AM ---------- Previous update was 09-22-16 at 11:04 PM ----------
I should have made this part of the initial requirement, but thought I could add it in myself after the original problem was solved. I can't wrap my head around what the script is actually doing, so I can't really add to it, unfortunately.
The additional requirement is as follows.
Extra column in the input file.
The additional output would be the value in column 1 for the initial row of the match. In this case the output (looking for one,two,three,four) should be:
So I understood enough to read $2 instead of $0, and the script works the same now, just basically ignoring the first of the two input columns. I'm assuming all we need is a 2nd array to store the first column values, updating itself at the same time the 1st array updates. Then when it comes time to print out, just print the first array element of the 1st column.
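The two-array idea can be sketched roughly like this. It is a guess at one workable layout, not a tested answer from the thread: `key[]` and `val[]` are hypothetical names, the input is assumed to be two whitespace-separated columns, and printing the column 1 value on its own line ahead of the six values is an assumption about the desired output format.

```shell
# find_seq_keyed LIST: like the single-column search, but matches on column 2
# and also reports column 1 of the first row of the match.
find_seq_keyed() {
    awk -v list="$1" -v before=2 '
    BEGIN { n = split(list, M, ",") }          # M[1..n] holds the values to find
    {
        key[NR] = $1; val[NR] = $2             # parallel arrays, updated together
        delete key[NR - n - before]            # both keep the last n+before rows
        delete val[NR - n - before]
        if (NR < n) next
        for (i = 1; i <= n; i++)               # column 2 of the last n rows
            if ((val[NR - n + i] "") != M[i]) next
        print key[NR - n + 1]                  # column 1 of the first matching row
        start = NR - n - before + 1
        if (start < 1) start = 1
        for (i = start; i <= NR; i++) print val[i]
    }'
}

# Usage with made-up two-column input:
printf '%s\n' 'a x' 'b y' 'c one' 'd two' 'e three' 'f four' 'g z' |
    find_seq_keyed one,two,three,four
```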
I should have included this in the initial requirement, sorry about that....
Moderator's Comments:
Please use CODE tags as required by forum rules!
Last edited by RudiC; 09-23-2016 at 06:03 AM..
Reason: Added CODE tags.
For instance (assuming every line consists of exactly one character, as in your example, and that the line terminator is just a newline character), the following command would work:
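One way to exploit that single-character assumption, offered as a guess rather than the command originally posted: join all lines into one string, let grep pick out two arbitrary characters followed by 1234, and split the result back into lines. The function name `joined_match` is hypothetical, and `grep -o` is assumed to be available (it is in GNU and BSD grep).

```shell
# Assumes every input line is exactly one character long.
joined_match() {
    tr -d '\n' |          # glue all lines into a single string
    grep -o '..1234' |    # two preceding characters plus the sequence
    fold -w 1             # back to one character per line
}

# Usage with made-up input:
printf '%s\n' 9 8 7 1 2 3 4 5 | joined_match
```

Two limits of this shortcut: `grep -o` reports non-overlapping matches only, and a sequence with fewer than two lines before it is not found at all.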
Last edited by rovf; 09-23-2016 at 04:01 AM..
Reason: Removing unnecessary -E switch
Let's say our Input_file is as follows, where I am assuming that the strings one, two, etc. could come in any order.
Then the code will be as follows.
Output will be as follows. EDIT: Adding a non-one liner form of solution too now.
So it takes care of the rule that the strings one, two, three, four should come consecutively, and if fewer than all 4 of them are present it shouldn't print those either. Please do let us know how it goes and if this helps you. EDIT2: Improving above code by removing array A inside the while loop.
Thanks,
R. Singh
Last edited by RavinderSingh13; 09-23-2016 at 06:30 AM..
Reason: Adding a non-one liner form of solution too now.