Extracting sequential pattern


 
# 1  
Old 09-14-2016

Hi,

Can someone advise me on how to write a script that extracts sequential lines? I was able to find a script and get it working, but it creates permutations of the inputs, and that is not what I need.

Code:
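awk 'function perm(p,s,     i) {       # p: positions left, s: prefix so far, i: local
       for(i=1;i<=n;i++)
         if(p==1)
           printf "%s%s\n",s,A[i]      # last position: print the finished sequence
         else
           perm(p-1,s A[i]", ")        # append A[i] and fill the remaining positions
     }
     {
       A[++n]=$1                       # store the first field of every input line
     }
     END{
       perm(n)                         # every n-long sequence of the inputs, repeats included
     }' infile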

Unfortunately, I don't understand the script well enough to modify it (not for lack of trying). I need to extract patterns of 2 to 5 sequential lines/words.

An illustration of what I need is as follows:

Eg.

inputfile.txt:
Code:
A
B
C
D
E
F
G

outputfile.txt:
Code:
A B
B C
C D
D E
E F
F G
A B C
B C D
C D E
D E F
E F G
A B C D
B C D E
C D E F
D E F G
A B C D E
B C D E F
C D E F G


Moderator's Comments:
More code tags please.

Last edited by zaxxon; 09-14-2016 at 03:02 AM.
# 2  
Old 09-14-2016
I don't see the point in taking a program from somewhere that is completely unrelated to your problem and hoping that it will mysteriously turn into a correct solution.

You should at least show some effort, for example by outlining an idea for an algorithm that does this task. At this point, this has nothing to do with shell programming; it's just about programming in general. Once that is done, we can discuss how to turn the algorithm into, say, a shell script. In that case, don't forget to indicate whether you are looking for a solution in a particular shell (bash, ksh, zsh, POSIX shell, ...) or whether any shell would be fine as long as it gets the job done.

BTW, while it is certainly possible and not too hard to write the whole thing in shell language, I would probably use a more convenient programming language, such as Ruby or Perl. In the end, it's a matter of taste.
# 3  
Old 09-14-2016
lol. Thanks for the advice. Truth be told, I do not know where to start, but the code I put up was the best I could find that does something similar. That's why I wanted to start from there.

Preferably I would use awk/grep, as I used them to clean up the data, but if (as you mentioned) I might be able to crack this faster in Perl, then it's time to brush up my Perl. :)

---------- Post updated at 03:25 PM ---------- Previous update was at 02:39 PM ----------

Code:
awk 'NR%3{printf "%s ",$0;next}{print;}' infile

The code above allows me to 'combine' 3 sequential lines.
Can I extend this to make it iterative?
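
Editorial sketch: the one-liner above joins non-overlapping groups of three lines (A B C, then D E F). One possible way to get overlapping groups instead is a sliding window that remembers only the last K lines. The function name `window` and the variable names are hypothetical, chosen for this illustration; this is a sketch under those assumptions, not something posted in the thread.

```shell
# window K: print every run of K consecutive input lines (a sliding window).
# K defaults to 3. The name "window" is hypothetical, chosen for this sketch.
window() {
  awk -v k="${1:-3}" '
    {
      buf[NR % k] = $0                       # ring buffer holding the last k lines
      if (NR >= k) {                         # a full window is available
        line = buf[(NR - k + 1) % k]         # oldest line in the window
        for (i = NR - k + 2; i <= NR; i++)   # append the newer lines in order
          line = line " " buf[i % k]
        print line
      }
    }'
}
```

It could be called as `window 3 < infile`. Because only K lines are kept in memory, this form also suits very long inputs, but it handles just one run length per pass.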
# 4  
Old 09-14-2016
It seems that you are already struggling with the *algorithm*, and not with the implementation.

Here is how I would approach the problem:

What you basically have is an ordered list of items (A B C D E ...), and you want to generate all consecutive runs from it. For example, B C D is such a run. Also, you don't consider a single element (D) by itself as a "run".

If you think of the whole list as an array, a run can be represented by two array indices that are different and where the first index is lower than the second. In the example above, the run B C D can be represented, assuming that we start counting array indices at 1, by the pair (2,4), because B is the second and D is the fourth element.

Since it is trivial to generate the actual list of elements once you have the index pair described above, your problem boils down to generating all such index pairs.

Assuming that your array contains N elements and N>1, you generate all pairs matching the above restrictions with two nested loops. Without focusing on a particular programming language, the algorithm can be sketched as

Code:
    for i from 1 to N-1
      for j from i+1 to N
        generate run (i,j)

Of course you need to build your array first, but as this doesn't require a clever algorithm, I left this part out for brevity.
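
Editorial sketch: the nested loops above, including the array-building step, might look as follows in awk wrapped in a shell function. The function name `all_runs` and the variable names are hypothetical. Note that, like the pseudocode, this generates runs of every length from 2 up to N, not only lengths 2 to 5, and it groups them by start index rather than by length.

```shell
# all_runs: read one item per line on stdin and print every consecutive run
# of two or more items, one run per line.
# The name "all_runs" is hypothetical, chosen for this sketch.
all_runs() {
  awk '
    { item[NR] = $0 }                    # build the array item[1..N]
    END {
      n = NR
      for (i = 1; i <= n - 1; i++) {     # i: index of the first element
        run = item[i]
        for (j = i + 1; j <= n; j++) {   # j: index of the last element
          run = run " " item[j]          # extend run (i,j-1) to run (i,j)
          print run
        }
      }
    }'
}
```

It could be called as `all_runs < inputfile.txt`. Building each run by extending the previous one avoids a third loop over the elements between i and j.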

Now you have the algorithm, and you can turn it into code in any language of your choice, so the next step is to choose the language. You *can* do it in awk, in the same way as you can fetch one bottle of milk from the grocer round the corner by using a truck, but there are plenty of languages around, and maybe the choice of language is also influenced by what you are going to do with the data afterwards.

I personally would do it in Ruby or in Zsh, but others might consider Python or ksh or C++ or LISP instead. Take the language you are familiar with, or one you are eager to learn.
# 5  
Old 09-14-2016
rovf's discussion couldn't entice you to come up with some ideas of your own? Pity ...

For exactly the problem given in post #1, try:
Code:
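awk '
        {T[NR] = $1}                            # store the first field of each line

END     {for (i=2; i<=5; i++)                   # i: run length, 2 to 5
           for (j=1; j<=NR-i+1; j++)            # j: index of the first line of the run
             {for (k=j; k<j+i; k++) printf "%s ", T[k]
              printf RS                         # end the line (a trailing blank remains)
             }
        }' file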
A B 
B C 
C D 
D E 
E F 
F G 
A B C 
B C D 
C D E 
D E F 
E F G 
A B C D 
B C D E 
C D E F 
D E F G 
A B C D E 
B C D E F 
C D E F G
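
Editorial sketch: a variant of the approach above that avoids the trailing blank on each output line and takes the minimum and maximum run length as arguments. The function name `runs_by_length` and the variable names are hypothetical; the defaults of 2 and 5 are taken from the question in post #1.

```shell
# runs_by_length MIN MAX: print all runs of MIN..MAX consecutive input lines,
# shortest runs first, with no trailing blank on any line.
# The name "runs_by_length" is hypothetical, chosen for this sketch.
runs_by_length() {
  awk -v min="${1:-2}" -v max="${2:-5}" '
    { T[NR] = $0 }                                    # store every input line
    END {
      for (len = min; len <= max; len++)              # len: run length
        for (start = 1; start <= NR - len + 1; start++) {
          line = T[start]                             # first element of the run
          for (k = start + 1; k < start + len; k++)   # remaining elements
            line = line " " T[k]
          print line
        }
    }'
}
```

It could be called as `runs_by_length 2 5 < inputfile.txt > outputfile.txt`. Run lengths larger than the number of input lines simply produce no output.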
