How to find untagged audio files?


 
# 22  
Old 12-22-2010
Test this one:

Code:
find . -type f -name '*.[Mm][Pp]3' -exec id3 -Rl {} + -o \
  -name '*.[Ff][Ll][Aa][Cc]' -exec metaflac --show-md5sum  \
    --with-filename --export-tags-to=- {} + |
  awk 'BEGIN {
    mp3n        = split("album artist title track",       mp3_tags)
    flacn       = split("album artist title tracknumber", flac_tags)
    ignorepatt  = "^ *(unknown|track *[0-9]*)* *$"    
    }
  $1 ~ /\.[Ff][Ll][Aa][Cc]$/ {
    if (fn) {
      for (i = 1; i <= flacn; i++) {
        if (tolower(tags[flac_tags[i]]) ~ ignorepatt) {
          invalid_tags[flac_tags[i]] = tags[flac_tags[i]]
          f || f++
          }
        }
      if (f) {
       print RS, fn, "has missing/invalid tags:"
        for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
        }      
      }    
    split(x, tags); split(x, invalid_tags)
    fn = $1; f = x
    }
  /^Filename/ {
    if (fn) {
      for (i = 1; i <= mp3n; i++) {
        if (tolower(tags[mp3_tags[i]]) ~ ignorepatt) {
          invalid_tags[mp3_tags[i]] = tags[mp3_tags[i]]
          f || f++
          }
        }
      if (f) {
       print RS, fn, "has missing/invalid tags:"
        for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
        }      
      }    
    split(x, tags); split(x, invalid_tags)
    fn = $2; f = x
    }
 { 
    if (split($0, tmp, "=") == 2) {
      $1 = tmp[1]; $2 = tmp[2]
      }
    tags[tolower($1)] = $2
    }
  END {
      if (fn ~ /\.[Ff][Ll][Aa][Cc]$/) {
        for (i = 1; i <= flacn; i++) {
          if (tolower(tags[flac_tags[i]]) ~ ignorepatt) {
            invalid_tags[flac_tags[i]] = tags[flac_tags[i]]
            f || f++
            }
          }
        if (f) {
          print RS, fn, "has missing/invalid tags:"
          for (t in invalid_tags)
            printf "%s -> %s\n", t, invalid_tags[t]
            }      
          }    
    else       
      for (i = 1; i <= mp3n; i++) {
        if (tolower(tags[mp3_tags[i]]) ~ ignorepatt) {
          invalid_tags[mp3_tags[i]] = tags[mp3_tags[i]]
          f || f++
          }
      if (f) {
        print RS, fn, "has missing/invalid tags:"
          for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
          }            
        }  
      }' FS=:

# 23  
Old 12-22-2010
Yes, this works. Thanks. Let's see if I understand what's happening. (Hmm, there is still a little hiccup. See the bottom for that.)

Code:
find . -type f -name '*.[Mm][Pp]3' -exec id3 -Rl {} + -o \
  -name '*.[Ff][Ll][Aa][Cc]' -exec metaflac --show-md5sum  \
    --with-filename --export-tags-to=- {} + |

find all mp3 and flac files, get the info and pipe the stream
Code:
 awk 'BEGIN {
    mp3n        = split("album artist title track",       mp3_tags)
    flacn       = split("album artist title tracknumber", flac_tags)
    ignorepatt  = "^ *(unknown|track *[0-9]*)* *$"    
    }

Define variables. What does the ^ mean in the pattern? After that it's any amount of spaces followed by unknown or track followed by any amount of spaces and digits. Followed by any amount of spaces till the end of line.
Code:
  $1 ~ /\.[Ff][Ll][Aa][Cc]$/ {

If any .flac is found at the end of a line do the following. Or .FLAC, .fLaC. Any combination of upper and lower case.
Code:
    if (fn) {

What is fn?
Code:
      for (i = 1; i <= flacn; i++) {
        if (tolower(tags[flac_tags[i]]) ~ ignorepatt) {
          invalid_tags[flac_tags[i]] = tags[flac_tags[i]]
          f || f++
          }
        }

Test all tags and remember the illegal ones.
Is 'f || f++' just a complex way to say 'f=1'?
Code:
      if (f) {
       print RS, fn, "has missing/invalid tags:"
        for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
        }      
      }

If illegal tags are found, send them to the output stream. Why use print one time and printf the other time?
Code:
    split(x, tags); split(x, invalid_tags)
    fn = $1; f = x
    }

I don't understand. I thought we were done for flac.
Code:
  /^Filename/ {
    if (fn) {
      for (i = 1; i <= mp3n; i++) {
        if (tolower(tags[mp3_tags[i]]) ~ ignorepatt) {
          invalid_tags[mp3_tags[i]] = tags[mp3_tags[i]]
          f || f++
          }
        }
      if (f) {
       print RS, fn, "has missing/invalid tags:"
        for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
        }      
      }    
    split(x, tags); split(x, invalid_tags)
    fn = $2; f = x
    }

Same but for mp3. I guess the next step is to combine these two pieces of code into one.
Code:
 { 
    if (split($0, tmp, "=") == 2) {
      $1 = tmp[1]; $2 = tmp[2]
      }
    tags[tolower($1)] = $2
    }

What does this mean?
Code:
  END {

Aha.. we are finished.
Code:
      if (fn ~ /\.[Ff][Ll][Aa][Cc]$/) {
        for (i = 1; i <= flacn; i++) {
          if (tolower(tags[flac_tags[i]]) ~ ignorepatt) {
            invalid_tags[flac_tags[i]] = tags[flac_tags[i]]
            f || f++
            }
          }
        if (f) {
          print RS, fn, "has missing/invalid tags:"
          for (t in invalid_tags)
            printf "%s -> %s\n", t, invalid_tags[t]
            }      
          }    
    else       
      for (i = 1; i <= mp3n; i++) {
        if (tolower(tags[mp3_tags[i]]) ~ ignorepatt) {
          invalid_tags[mp3_tags[i]] = tags[mp3_tags[i]]
          f || f++
          }
      if (f) {
        print RS, fn, "has missing/invalid tags:"
          for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
          }            
        }  
      }' FS=:

We checked, we sent everything to the output, so what is the need for almost the same code again? But what is FS=:?


Oh.. I noticed something. It's not really perfect yet. It seems to go wrong when switching to another format:
Code:
  ./alex roeka - 1999  - wildernis/01 - Het Schip Genaamd 'De Nacht'.mp3 has missing/invalid tags:
track -> 

  ./alex roeka - 1999  - wildernis/06 - Noem 'T Geen Liefde.mp3 has missing/invalid tags:
title ->  Track01                       

  ./alex roeka - 1999  - wildernis/08 - De Mannenwoestijn.mp3 has missing/invalid tags:
tracknumber -> 

 ./1988 You Can't Do That On The Stage Anymore, Volume 1/Volume 1 Disc 1/03. Sofa #1.flac has missing/invalid tags:
title -> track 1

Track 08 from Alex Roeka is correctly filled in, but I still get a message that tracknumber isn't filled in, and that is a flac tag, not an mp3 tag. Compare the correctly detected track 01: there the property is called 'track', not 'tracknumber'.
And if I test with all tags filled in correctly, this track is still wrongly accused:
Code:
  ./alex roeka - 1999  - wildernis/08 - De Mannenwoestijn.mp3 has missing/invalid tags:
tracknumber ->

Well, actually it is correctly detected that tracknumber is missing, because it isn't there. That is because tracknumber shouldn't be in an mp3. For mp3 there just shouldn't be a check for tracknumber at all; for flac, of course, there should be.

Last edited by MrZehl; 12-22-2010 at 07:24 AM..
# 24  
Old 12-22-2010
OK, let's try to correct the code first, I'll explain the code after that.

Try this:
Code:
find . -type f -name '*.[Mm][Pp]3' -exec id3 -Rl {} + -o \
  -name '*.[Ff][Ll][Aa][Cc]' -exec metaflac --show-md5sum  \
    --with-filename --export-tags-to=- {} + |
  awk 'BEGIN {
    mp3n        = split("album artist title track",       mp3_tags)
    flacn       = split("album artist title tracknumber", flac_tags)
    ignorepatt  = "^ *(unknown|track *[0-9]*)* *$"    
    }
  /^Filename/ || $1 ~ /\.[Ff][Ll][Aa][Cc]$/ {
    if (fn) {
      if (fn ~ /\.[Ff][Ll][Aa][Cc]$/) {
        for (i = 1; i <= flacn; i++) {
          if (tolower(tags[flac_tags[i]]) ~ ignorepatt) {
            invalid_tags[flac_tags[i]] = tags[flac_tags[i]]
            f || f++
            }
          }
        if (f) {
          print RS, fn, "has missing/invalid tags:"
          for (t in invalid_tags)
            printf "%s -> %s\n", t, invalid_tags[t]
            }      
          }    
     else { 
       for (i = 1; i <= mp3n; i++) {
         if (tolower(tags[mp3_tags[i]]) ~ ignorepatt) {
           invalid_tags[mp3_tags[i]] = tags[mp3_tags[i]]
           f || f++
             }
           }
       if (f) {
         print RS, fn, "has missing/invalid tags:"
         for (t in invalid_tags)
           printf "%s -> %s\n", t, invalid_tags[t]
          } 
        }
      }
    split(x, tags); split(x, invalid_tags)
    fn = /^Filename/ ? $2 : $1; f = x        
     }
 { 
    if (split($0, tmp, "=") == 2) {
      $1 = tmp[1]; $2 = tmp[2]
      }
    tags[tolower($1)] = $2
    }
  END {
       if (fn ~ /\.[Ff][Ll][Aa][Cc]$/) {
        for (i = 1; i <= flacn; i++) {
          if (tolower(tags[flac_tags[i]]) ~ ignorepatt) {
            invalid_tags[flac_tags[i]] = tags[flac_tags[i]]
            f || f++
            }
          }
       if (f) {
         print RS, fn, "has missing/invalid tags:"
         for (t in invalid_tags)
           printf "%s -> %s\n", t, invalid_tags[t]
          }      
        }    
    else {
      for (i = 1; i <= mp3n; i++) {
        if (tolower(tags[mp3_tags[i]]) ~ ignorepatt) {
          invalid_tags[mp3_tags[i]] = tags[mp3_tags[i]]
          f || f++
          }
        }
      if (f) {
       print RS, fn, "has missing/invalid tags:"
        for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
        }
      }
      }' FS=:

This User Gave Thanks to radoulov For This Post:
# 25  
Old 12-22-2010
I can't find anything not working perfectly on this one.
It's just working fine.
# 26  
Old 12-22-2010
OK, now let's clean up the code a little bit:

Code:
find . -type f -name '*.[Mm][Pp]3' -exec id3 -Rl {} + -o \
  -name '*.[Ff][Ll][Aa][Cc]' -exec metaflac --show-md5sum  \
    --with-filename --export-tags-to=- {} + |
  awk 'BEGIN {
    mp3n        = split("album artist title track",       mp3_tags)
    flacn       = split("album artist title tracknumber", flac_tags)
    ignorepatt  = "^ *(unknown|track *[0-9]*)* *$"    
    }
  
  /^Filename/ || $1 ~ /\.[Ff][Ll][Aa][Cc]$/ {
    fn && check_tags(fn ~ /\.[Ff][Ll][Aa][Cc]$/ ? "flac" : x)
    fn = /^Filename/ ? $2 : $1; f = x        
     }
  
  { 
    if (split($0, tmp, "=") == 2) { $1 = tmp[1]; $2 = tmp[2] }
    tags[tolower($1)] = $2 
      }
  
  END { check_tags(fn ~ /\.[Ff][Ll][Aa][Cc]$/ ? "flac" : x) }
  
  function check_tags(ty,   f, i, t) {
    if (ty == "flac") {
      for (i = 1; i <= flacn; i++) {
        if (tolower(tags[flac_tags[i]]) ~ ignorepatt) {
          invalid_tags[flac_tags[i]] = tags[flac_tags[i]]; f || f++ 
          }
        }
      }    
    else {
      for (i = 1; i <= mp3n; i++) {
        if (tolower(tags[mp3_tags[i]]) ~ ignorepatt) {
          invalid_tags[mp3_tags[i]] = tags[mp3_tags[i]]; f || f++
          }
        }
      }  
    if (f) {
      print RS, fn, "has missing/invalid tags:"
        for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
       }
    split(x, tags); split(x, invalid_tags)   
    }' FS=:

# 27  
Old 12-22-2010
Still running fine here.
# 28  
Old 12-22-2010
Good :)

---------- Post updated at 01:43 PM ---------- Previous update was at 01:26 PM ----------

Quote:
Originally Posted by MrZehl
Let's see if I understand what's happening
[...]

Code:
find . -type f -name '*.[Mm][Pp]3' -exec id3 -Rl {} + -o \
  -name '*.[Ff][Ll][Aa][Cc]' -exec metaflac --show-md5sum  \
    --with-filename --export-tags-to=- {} + |

find all mp3 and flac files, get the info and pipe the stream
Correct.
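
For anyone who wants to see the shape of that find command without touching real music files, here is a small sketch. It builds a throwaway directory and uses printf as a stand-in for id3 and metaflac; the function name find_demo, the file names and the printf substitution are all assumptions made for this illustration.

```shell
# find_demo: run the same find expression against a temporary tree.
find_demo() {
  demo_dir=$(mktemp -d) || return 1
  mkdir "$demo_dir/album"
  : > "$demo_dir/album/01 - one.mp3"
  : > "$demo_dir/album/02 - two.MP3"
  : > "$demo_dir/album/03 - three.flac"
  : > "$demo_dir/album/cover.jpg"

  # Two branches joined by -o; each branch hands its matches to one command.
  # printf stands in for "id3 -Rl" and "metaflac ..." here.
  find "$demo_dir" -type f -name '*.[Mm][Pp]3' -exec printf 'mp3:  %s\n' {} + -o \
    -name '*.[Ff][Ll][Aa][Cc]' -exec printf 'flac: %s\n' {} +

  rm -rf "$demo_dir"
}

find_demo
```

One detail worth knowing: because of how -o binds, the -type f test only guards the mp3 branch, so a directory whose name ends in .flac would also reach the second command.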

Quote:
Code:
 awk 'BEGIN {
    mp3n        = split("album artist title track",       mp3_tags)
    flacn       = split("album artist title tracknumber", flac_tags)
    ignorepatt  = "^ *(unknown|track *[0-9]*)* *$"    
    }

Define variables. What does the ^ mean in the pattern?
After that it's any amount of spaces followed by unknown or track followed by any amount of spaces and digits.
Followed by any amount of spaces till the end of line.
In a regular expression, ^ is a zero-width match at the beginning of the string (here, the tag value).
So the above pattern matches:
0 or more spaces, followed by 0 or more occurrences of the group in parentheses
(either unknown, or track followed by 0 or more spaces and 0 or more digits),
followed by 0 or more spaces. The $ matches the end of the string (the opposite of ^).
Because every part is optional, an empty or all-blank value matches too, which is how missing tags are caught.
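
To try the pattern on its own, a quick sketch like this one can help. It feeds single values to awk and reports whether ignorepatt matches; the helper name classify and the sample values are made up for the example.

```shell
# classify VALUE: print "invalid" if VALUE looks like a placeholder, else "ok".
classify() {
  printf '%s\n' "$1" | awk '
    BEGIN { ignorepatt = "^ *(unknown|track *[0-9]*)* *$" }
    { print (tolower($0) ~ ignorepatt ? "invalid" : "ok") }'
}

classify ""                # invalid: the group may match zero times
classify "  Unknown "      # invalid
classify "Track01"         # invalid
classify "track 1"         # invalid
classify "Het Schip"       # ok
classify "Track 1 (live)"  # ok: trailing text is not covered by the pattern
```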

Quote:
Code:
  $1 ~ /\.[Ff][Ll][Aa][Cc]$/ {

If any .flac is found at the end of a line do the following. Or .FLAC, .fLaC. Any combination of upper and lower case.
Correct.

Quote:
Code:
    if (fn) {

What is fn?
The previously saved filename.

Quote:
Code:
      for (i = 1; i <= flacn; i++) {
        if (tolower(tags[flac_tags[i]]) ~ ignorepatt) {
          invalid_tags[flac_tags[i]] = tags[flac_tags[i]]
          f || f++
          }
        }

Test all tags and remember the illegal ones.
Yes.

Quote:
Is 'f || f++' just a complex way to say 'f=1'?
No, it's a simple way to say: f++ only if f is 0 or unset, so f ends up as 1 and stays there :)
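
A tiny sketch of the difference between the two forms, with a plain counter g next to it for comparison (flag_demo and g are names invented for this example):

```shell
flag_demo() {
  awk 'BEGIN {
    for (n = 1; n <= 3; n++) {
      f || f++          # increments only while f is 0 or unset
      g++               # plain counter, for comparison
    }
    print "f=" f, "g=" g
  }'
}

flag_demo   # prints: f=1 g=3
```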

Quote:
Code:
      if (f) {
       print RS, fn, "has missing/invalid tags:"
        for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
        }      
      }

If illegal tags are found, send them to the output stream. Why use print one time and printf the other time?
Correct. printf is used to format the output;
you can use print instead, if you find it
clearer:

Code:
print t, "-->", invalid_tags[t]

Yes, maybe in this case it's clearer.
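
Side by side, the two statements behave like this. It is only a sketch: print_demo and the sample values are assumptions, not part of the script above.

```shell
print_demo() {
  awk 'BEGIN {
    t = "title"; v = "Track01"
    printf "%s -> %s\n", t, v     # printf: explicit format, explicit newline
    print t, "-->", v             # print: arguments joined by OFS (a space by default)
  }'
}

print_demo
```

print appends the output record separator by itself, while printf writes only what the format string says, so forgetting the \n there would glue the lines together.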

Quote:
Code:
    split(x, tags); split(x, invalid_tags)
    fn = $1; f = x
    }

I don't understand. I thought we were done for flac.
These lines reset the temporary variables and empty the temporary arrays
(x is never assigned, so split(x, array) splits an empty string and leaves the array empty).
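
The array-emptying trick can be checked in isolation. In this sketch the element counts before and after the split are printed; clear_demo, before and after are names chosen for the example.

```shell
clear_demo() {
  awk 'BEGIN {
    tags["title"] = "Track01"; tags["artist"] = "unknown"
    before = 0; for (k in tags) before++
    split(x, tags)      # x is never assigned, so this splits the empty string
    after = 0; for (k in tags) after++
    print before, after
  }'
}

clear_demo   # prints: 2 0
```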

Quote:
Code:
  /^Filename/ {
    if (fn) {
      for (i = 1; i <= mp3n; i++) {
        if (tolower(tags[mp3_tags[i]]) ~ ignorepatt) {
          invalid_tags[mp3_tags[i]] = tags[mp3_tags[i]]
          f || f++
          }
        }
      if (f) {
       print RS, fn, "has missing/invalid tags:"
        for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
        }      
      }    
    split(x, tags); split(x, invalid_tags)
    fn = $2; f = x
    }

Same but for mp3. I guess the next step is to combine these two pieces of code into one.
Yes,
see the last version.

Quote:
Code:
 {
    if (split($0, tmp, "=") == 2) {
      $1 = tmp[1]; $2 = tmp[2]
      }
    tags[tolower($1)] = $2
    }

What does this mean?
The main code :)
These lines populate the tags array.
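
Here is a minimal sketch of that block at work. The two input lines are assumed shapes inferred from the script itself (a name: value line for mp3 and a NAME=value line for flac), not captured tool output, and populate_demo is a made-up name.

```shell
populate_demo() {
  printf 'Artist: Alex Roeka\nTITLE=track 1\n' |
  awk '
    {
      # NAME=value lines are re-split on "="; name: value lines keep
      # the fields that FS=: already produced.
      if (split($0, tmp, "=") == 2) { $1 = tmp[1]; $2 = tmp[2] }
      tags[tolower($1)] = $2
    }
    END {
      print "artist=[" tags["artist"] "]"
      print "title=[" tags["title"] "]"
    }' FS=:
}

populate_demo
```

The mp3 value keeps its leading space, which the ignore pattern tolerates. A value that itself contains = makes split return more than 2, so such a line is stored under a different key.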
Quote:
Code:
  END {

Aha.. we are finished.
Yep, almost :)

Quote:
Code:
      if (fn ~      /\.[Ff][Ll][Aa][Cc]$/) {
        for (i = 1; i <= flacn; i++) {
          if (tolower(tags[flac_tags[i]]) ~ ignorepatt) {
            invalid_tags[flac_tags[i]] = tags[flac_tags[i]]
            f || f++
            }
          }
        if (f) {
          print RS, fn, "has missing/invalid tags:"
          for (t in invalid_tags)
            printf "%s -> %s\n", t, invalid_tags[t]
            }      
          }    
    else      
      for (i = 1; i <= mp3n; i++) {
        if (tolower(tags[mp3_tags[i]]) ~ ignorepatt) {
          invalid_tags[mp3_tags[i]] = tags[mp3_tags[i]]
          f || f++
          }
      if (f) {
        print RS, fn, "has missing/invalid tags:"
          for (t in invalid_tags)
          printf "%s -> %s\n", t, invalid_tags[t]
          }            
        }  
      }' FS=:

We checked, we sent everything to the output, so what is the need for almost the same code again?
Because each time a new filename line is read, we check the data saved for the previous file;
therefore the last file has to be processed after the entire input has been read.
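
The same "report the previous record when the next header arrives" idea, stripped down to a sketch. The input format is simplified for the example (no space after the colon), and group_demo and title are invented names.

```shell
group_demo() {
  printf 'Filename:a.mp3\nTitle:one\nFilename:b.mp3\nTitle:two\n' |
  awk '
    /^Filename/ {
      if (fn) print fn " -> " title    # report the PREVIOUS file
      fn = $2; title = ""
    }
    /^Title/ { title = $2 }
    END { if (fn) print fn " -> " title }   # the last file has no header after it
  ' FS=:
}

group_demo
```

Without the END block only a.mp3 would be reported, because nothing follows b.mp3 to trigger the check.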

Quote:
But what is FS=:?
The field separator, set to a colon (:).
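
A one-line sketch of what that assignment does to a line; the sample line is an assumed shape and fs_demo is a made-up name.

```shell
fs_demo() {
  printf 'Filename: ./a.mp3\n' | awk '{ print "[" $1 "]", "[" $2 "]" }' FS=:
}

fs_demo   # prints: [Filename] [ ./a.mp3]
```

Writing awk -F: '...' would split the input the same way. The difference is that an assignment placed after the program text is not yet in effect inside a BEGIN block, which does not matter here because BEGIN reads no input.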
 