Missing sequences in filenames


 
# 1  
Old 08-13-2014

Hi,

Could anybody please help me find the missing sequence numbers in a set of filenames?
For example, I have these files:
Code:
OOOAAAALOGS400001.txt
OOOAAAALOGS400002.txt
OOOAAAALOGS400003.txt
OOOBBBBLOGS400001.txt
OOOBBBBLOGS400002.txt
OOOBBBBLOGS400003.txt
OOOCCCCLOGS400001.txt
OOOCCCCLOGS400002.txt
OOOCCCCLOGS400004.txt

I want to find which sequence numbers are missing.
In the files above, nothing is missing for AAAA or for BBBB.
For CCCC, however, sequence 400003 is missing, that is, the file OOOCCCCLOGS400003.txt does not exist.

This is what I have so far, but I cannot make it work.

Code:
for file in *.txt
do
  name=${file%%[LOGS]*}
  echo $name
for i in $name

do

{ gawk '\
    {
        fn = $1
        seq = substr(fn,14)
        sub(".gz$","",seq)
        seq += 0
        if(seq>big) big = seq
        seen[seq] = fn
     }
}

 END \
    {
        st = en = ""
        for(off=1; off<=big; off++)
          if(seen[off] == "")
          {
              if( st != "") printf "found '%s' to '%s'\n", st, en
              st = en = ""
              printf "missing %d\n", off
          }
          else
          {
              if(st=="") st = seen[off]
              en = seen[off]
          }

#       if( st != "") printf "found '%s' to '%s'\n", st, en
    }'


The OOO and LOGS parts of the filename are constants; they do not change.
Can anybody please help me with this?
Many thanks!

Last edited by vbe; 08-13-2014 at 12:31 PM.. Reason: please use code tags NOT icode ! Thanks
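
A note on the attempt above: in `${file%%[LOGS]*}` the brackets form a character class, so the pattern matches any suffix that starts with a single L, O, G or S. Every name here starts with O, so the whole name is removed and `$name` ends up empty. The sketch below shows one way to split a name on the literal string LOGS instead. The helper name `split_name` and the `.txt` extension are assumptions for illustration only.

```shell
# Sketch: split a name such as OOOAAAALOGS400001.txt into group and sequence.
# split_name is a hypothetical helper, not something from the thread.
split_name() {
    file=$1
    group=${file%%LOGS*}   # remove the literal "LOGS" and everything after it
    seq=${file#*LOGS}      # remove everything up to and including the first "LOGS"
    seq=${seq%.txt}        # remove the extension (assumed to be .txt)
    printf '%s %s\n' "$group" "$seq"
}

split_name OOOAAAALOGS400001.txt   # prints: OOOAAAA 400001
```

With the group and the sequence separated like this, the shell loop no longer needs the fixed offset used in `substr(fn,14)`.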
# 2  
Old 08-13-2014
For exactly your sample, try
Code:
awk '{T=substr($0,4,4); C=substr($0,12,6); if (T != TA) {CA=C;TA=T}; while (CA<=C) print "OOO"T"LOGS"CA++".txt"}' file
OOOAAAALOGS400001.txt
OOOAAAALOGS400002.txt
OOOAAAALOGS400003.txt
OOOBBBBLOGS400001.txt
OOOBBBBLOGS400002.txt
OOOBBBBLOGS400003.txt
OOOCCCCLOGS400001.txt
OOOCCCCLOGS400002.txt
OOOCCCCLOGS400003.txt
OOOCCCCLOGS400004.txt
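
The one-liner above prints the complete series, including the names that do not exist yet. To see only the missing names, one option is to filter that output against the real listing. The following is a minimal sketch under these assumptions: the names are stored one per line in a file, they are sorted, and the sequence numbers all have the same width. The helper names `expected_names` and `missing_names` are made up for this example.

```shell
# expected_names wraps the one-liner from the post above; it reads names on stdin.
expected_names() {
    awk '{T=substr($0,4,4); C=substr($0,12,6); if (T != TA) {CA=C;TA=T}
          while (CA<=C) print "OOO"T"LOGS"CA++".txt"}'
}

# missing_names (hypothetical helper): $1 is a file listing the existing names,
# sorted, one per line. grep -v drops every expected name that really exists;
# -x matches whole lines only and -F treats the names as fixed strings.
missing_names() {
    expected_names < "$1" | grep -vxFf "$1"
}

list=$(mktemp)
printf '%s\n' OOOCCCCLOGS400001.txt OOOCCCCLOGS400002.txt OOOCCCCLOGS400004.txt > "$list"
missing_names "$list"    # prints OOOCCCCLOGS400003.txt
rm -f "$list"
```

Because the one-liner converts the counter to a number, leading zeros would be lost, so this only suits names like the 400001 sample.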

# 3  
Old 10-09-2014
Hi,

Thank you for your reply.
I have tested this, but it still doesn't do what I want.

Can anybody please help me with this?

Thank you a lot!
# 4  
Old 10-09-2014
Try:
Code:
printf "%s\n" *.txt | awk -FLOGS '{A[$1]; B[$2]; C[$1,$2]} END{for(i in A) for(j in B) if (!((i,j) in C)) print i FS j}'
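
For readers following along: the command above collects every prefix (the text before LOGS) and every suffix (the text after LOGS), then prints each prefix and suffix pair that has no file. That is a cross-check across all groups rather than a search for gaps inside one group. The sketch below wraps the same awk program in a function so the effect can be seen on a small input. The name `cross_missing` and the added `sort` are assumptions for illustration; `sort` is there only because the order of `for (i in A)` is unspecified.

```shell
# cross_missing (hypothetical wrapper) runs the awk program from the post above
# on names read from stdin and sorts the result for a stable order.
cross_missing() {
    awk -FLOGS '{A[$1]; B[$2]; C[$1,$2]}
        END {for (i in A) for (j in B) if (!((i,j) in C)) print i FS j}' | sort
}

printf '%s\n' OOOAAAALOGS00001.txt OOOAAAALOGS00002.txt OOOBBBBLOGS00005.txt |
    cross_missing
# prints:
# OOOAAAALOGS00005.txt
# OOOBBBBLOGS00001.txt
# OOOBBBBLOGS00002.txt
```

When each group uses its own range of numbers, every number seen in one group is reported as missing from the others.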

# 5  
Old 10-09-2014
Dear Scrutinizer,

Thanks for replying.

But it doesn't show the missing sequences as it should.
In the logs below, no sequence is missing for AAAA: 1, 2 and 3 have no gap between them.
The same goes for BBBB, as 5, 6 and 7 have no gap between them.
For CCCC, however, sequence 9 is missing: the files are 7, 8 and 10.

Code:
OOOAAAALOGS00001.txt 
OOOAAAALOGS00002.txt 
OOOAAAALOGS00003.txt 
OOOBBBBLOGS00005.txt 
OOOBBBBLOGS00006.txt 
OOOBBBBLOGS00007.txt 
OOOCCCCLOGS00007.txt 
OOOCCCCLOGS00008.txt 
OOOCCCCLOGS00010.txt

Could you please help me find the missing sequences for every letter group (A, B, C, etc.)?

Many thanks!

Last edited by Scrutinizer; 10-09-2014 at 11:06 AM..
# 6  
Old 10-09-2014
OK, gaps only. I thought you needed something else. Could you try this:
Code:
printf "%s\n" *.txt | awk -FLOGS '$1==P[1]{for(i=P[2]+1; i<$2+0; i++) printf "%s%05d\n", $1 FS, i} {split($0,P)}'
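
The gap search described in this thread can also be written out in a longer form that is easier to adjust. The following is only a sketch, not the command from the post above. It assumes names of the form prefix, LOGS, zero-padded number, .txt, with the same number width inside each group, and it ignores stray blanks after the extension. The function name `find_gaps` is hypothetical.

```shell
# find_gaps (hypothetical helper): reads names on stdin and prints the names
# that are missing between consecutive sequence numbers of the same prefix.
find_gaps() {
    sort | awk -FLOGS '
    {
        seq = $2
        sub(/\.txt[ \t]*$/, "", seq)       # strip extension and trailing blanks
        width = length(seq)                # remember the zero-padded width
        n = seq + 0                        # numeric value, leading zeros dropped
        if ($1 == prev_group) {
            fmt = "%s" FS "%0" width "d.txt\n"
            for (i = prev_n + 1; i < n; i++)
                printf fmt, $1, i          # one line per missing number
        }
        prev_group = $1
        prev_n = n
    }'
}

printf '%s\n' OOOCCCCLOGS00007.txt OOOCCCCLOGS00008.txt OOOCCCCLOGS00010.txt |
    find_gaps
# prints OOOCCCCLOGS00009.txt
```

In a directory of real files it could be fed with `printf '%s\n' *.txt | find_gaps`. Unlike a fixed `%05d` format, the width is taken from each name, and the extension is kept in the output.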

# 7  
Old 10-10-2014
Thank you very much for your support, but this doesn't show anything.
I tried to modify it, but still got nothing.

---------- Post updated at 08:16 AM ---------- Previous update was at 03:00 AM ----------

Could you please help me with this?

Thank you!