BASH Regex - get filename tags, labels and descriptions


 
# 1  
Old 10-13-2018

Hi, I am trying to switch from Windows to Linux.
I have been using AutoHotkey scripts for some little things.
I started writing some bash scripts for the Nemo file browser in Linux Mint and I am trying to convert some of my AHK scripts, but as I am not a programmer I have a hard time with the regex stuff.

So far I have split the file path into Directory, FileName, Filename-Without-Extension and Extension, and that part works OK.
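For reference, a minimal sketch of what such a split might look like using only bash parameter expansion. The function name `split_path` and the variable names are hypothetical, not taken from the original script, and edge cases such as files directly under `/` or hidden files like `.bashrc` are not handled.

```bash
#!/bin/bash
# Hypothetical helper: split a path into its parts with parameter expansion.
split_path() {
    local path=$1
    dir=${path%/*}                 # everything before the last slash
    [[ $path == */* ]] || dir=.    # no slash at all: use the current directory
    fileName=${path##*/}           # everything after the last slash
    nameNoExt=${fileName%.*}       # drop only the last ".ext", keep earlier dots
    if [[ $fileName == *.* ]]; then
        ext=${fileName##*.}        # text after the last dot
    else
        ext=""                     # no dot, so no extension
    fi
}

split_path "/home/user/docs/multiple.dots and spaces [tag1 tag2].txt"
printf '%s\n' "$dir" "$fileName" "$nameNoExt" "$ext"
```

Because `%.*` removes only the shortest match from the end, names with several dots keep everything before the final extension.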

Every filename may or may not contain [tags], {labels} and (description) in brackets at the end of the filename.
Where I am stuck is that the regex that works in AutoHotkey does not work in bash. I tried sed and awk, but I have no idea what I am doing.

example filename:
multiple.{1}(1)[1] dots.and spaces {labels}(description)[tag1 tag2].txt
filename with only tags [tag1 tag2 tag3].txt

Note:
Tags, labels or description can be any words or numbers, not just what I have in example.
Not every filename will have all 3. Some could have only tags, others only descriptions or labels.

I need to split each name and get this:
shortName = "multiple.{1}(1)[1] dots.and spaces" -- filename without any tags, labels and descriptions
tags = "tag1 tag2" -- tags without brackets
labels = "labels" -- labels without brackets
description = "description" -- description without brackets


Original regex from Autohotkey that works in all of my scripts

Code:
; NameNoExt is filename without extension
; get filename only without tags[], labels{} or description() 
ShortName:=RegExReplace(NameNoExt, "\s*(\([^()]+\)|\{[^{}]+\}|\[[^][]+\])+$")
; Get tags
Tags:=RegExReplace(NameNoExt, ".*\[(.*)?\](?!\s).*$", "$1")
; Get labels
Label:= RegExReplace(NameNoExt, ".*\{(.*)?\}(?!\s).*$", "$1")
; Get description
Description := RegExReplace(NameNoExt, ".*\((.*)?\)(?!\s).*$", "$1")



Here is what I have in bash, and it is going nowhere.
To say it again, this is where I need help: a regex that splits out these tags, labels and descriptions.

Code:
str="multiple.{1}(1)[1] dots.and spaces {lab}(desc)[tag1 tag2].txt"

IFS='
'

NameNoExt="${str%.*}"        # strip only the last extension; works for filenames with multiple dots

shortName="???"        # should become "multiple.{1}(1)[1] dots.and spaces"

#regexp=".*\[(.*)?\](?!\s).*$"    # from autohotkey
regexp="\[([^)]+)\]+$"
if [[ $NameNoExt =~ $regexp ]]
then 
  TAG="${BASH_REMATCH[1]}"
else 
  TAG="Nothing"
fi
#regexp=".*\{(.*)?\}(?!\s).*$"    # from autohotkey
regexp="\{([^}]+)\}+$"
if [[ $NameNoExt =~ $regexp ]]
then 
  LAB="${BASH_REMATCH[1]}"
else 
  LAB="Nothing"
fi
#regexp=".*\((.*)?\)(?!\s).*$"    # from autohotkey
regexp="\(([^)]+)\)+$"
if [[ $NameNoExt =~ $regexp ]]
then 
  DESC="${BASH_REMATCH[1]}"
else 
  DESC="Nothing"
fi
zenity --info --text "`printf "Input = \"$str\"\n\nNameNoExt = \"$NameNoExt\"\nShort Name  = \"$shortName\"\nTags  = \"$TAG\"\nLabels  = \"$LAB\"\nDescription  = \"$DESC\""`"

I highly appreciate any help with this.
Thank you all.
# 2  
Old 10-14-2018
Hi, try
Code:
regexp='.*\[([^]]+)\]'

,
Code:
regexp='.*\{([^}]+)\}'

and
Code:
regexp='.*\(([^)]+)\)'

respectively.
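A small sketch of how patterns of this shape might be used with `BASH_REMATCH`. The helper name `extract_last` is hypothetical, and the tag pattern here uses `[^]]` to exclude the closing square bracket. The leading greedy `.*` is what makes the last bracket pair of each kind win.

```bash
#!/bin/bash
# Hypothetical helper: print the contents of the last bracket pair matched
# by the given pattern, or nothing if the pattern does not match.
extract_last() {
    local name=$1 regexp=$2
    if [[ $name =~ $regexp ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    fi
}

name='multiple.{1}(1)[1] dots.and spaces {lab}(desc)[tag1 tag2]'
extract_last "$name" '.*\[([^]]+)\]'   # tags
extract_last "$name" '.*\{([^}]+)\}'   # labels
extract_last "$name" '.*\(([^)]+)\)'   # description
```

Note that these patterns are not anchored to the end of the name, so a bracket pair anywhere in the name can match when there is none at the tail.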
# 3  
Old 10-15-2018
Hi, thank you.
That works for the given example. But if I remove one of the end brackets (like [*]), this regex will find another bracket of the same kind if one exists within the filename (like that [1] in the middle of the name).

In most cases (99%) filenames will not have any extra brackets within the name, only at the end. I just want to make sure there are no mistakes in case one pops up.

Is there any way to check whether the brackets are at the tail, separated by a space from the filename (without extension)? If they are, get the contents; if not, skip it.
As an example:
some [1] name [tag]
some [1] name <-- does not have brackets at the tail

Anyway, thank you very much for your reply. It helped a lot. I might end up just using that for now until I find a better solution.
Cheers
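One possible way to do that check, sketched under the assumption that "at the tail" means one or more bracket groups, with no spaces between them, preceded by whitespace and running to the very end of the name. The names `tail_re` and `get_tail` are made up for this illustration.

```bash
#!/bin/bash
# Whitespace, then one or more (..), {..} or [..] groups, then end of string.
# Capture group 1 holds the whole run of tail groups.
tail_re='[[:space:]]+((\([^()]+\)|\{[^{}]+\}|\[[^][]+\])+)$'

# Hypothetical helper: print the tail groups, or nothing if there are none.
get_tail() {
    if [[ $1 =~ $tail_re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    fi
}

get_tail 'some [1] name [tag]'   # prints [tag]
get_tail 'some [1] name'         # prints nothing
```

The `$` anchor is what keeps the `[1]` in the middle of the name from matching, and `[[:space:]]` is used in place of `\s` because it is the portable POSIX spelling.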
# 4  
Old 10-21-2018
OK, I think I got this thing working now.
I did not know how to extract the contents A, B and C from only the end brackets, so I first capture the tail brackets (if any) into a new variable and then extract A, B and C from that. Otherwise the regex would pick up 1, 2, 3 from the brackets inside the filename.

Anyway, it is OK as it is now, but if there is a better solution I would love to learn it.
Note: this is for string testing only. In a real script I would check whether the path is a file or a directory before getting shortName and the extension.

Code:
#!/bin/bash
test_str=(
    'test1.{1}(2)[3] dots.spaces {A1}(B1)[C1].ext'
    'test2.[3]{1}(2) dots.spaces [C1](B1){A1}.ext'
    'test3 [3].com [C1].ext'
    'test4 {1}(2)[3]dots.ext'
    'test5 {1} (2) [3] dots.ext'
    'test6 (B1)[C1].ext'
    'test7[C1](B1).ext'
)
IFS='
'
regexp='\s*(\([^)]+\)|\{[^}]+\}|\[[^][]+\])+$'
subst=""
regexTag='.*\[([^]]+)\]'
regexLab='.*\{([^}]+)\}'
regexDes='.*\(([^)]+)\)'


for p in "${test_str[@]}"
do
    NameNoExt="${p%.*}"    # strip only the last extension
    if [[ $NameNoExt =~ $regexp ]]; then
        tailBrackets="${BASH_REMATCH[0]}";
        shortName="${NameNoExt%"$tailBrackets"}"    # strip the matched tail; avoids passing an unquoted regex to sed
    else
        shortName="$NameNoExt"
        tailBrackets=""
    fi
    # if tail bracket exist then extract A,B,C
    if [[ -n $tailBrackets ]]; then
        if [[ $tailBrackets =~ $regexTag ]]
        then 
          TAG="${BASH_REMATCH[1]}"
        else 
          TAG=""
        fi
        if [[ $tailBrackets =~ $regexLab ]]
        then 
          LAB="${BASH_REMATCH[1]}"
        else 
          LAB=""
        fi
        if [[ $tailBrackets =~ $regexDes ]]
        then 
          DES="${BASH_REMATCH[1]}"
        else 
          DES=""
        fi
    else
        TAG=""
        LAB=""
        DES=""
    fi
    zenity --info --text "`printf "Filename = \"$p\"\n\nNameNoExt = \"$NameNoExt\"\nShort Name  = \"$shortName\"\nTail Brackets = \"$tailBrackets\"\nA  = \"$LAB\"\nB  = \"$DES\"\nC  = \"$TAG\""`"
done
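As one possible alternative, here is a sketch that peels the bracket groups off the end of the name one at a time, so the groups inside the filename are never looked at. The function name `parse_name` is hypothetical; the variable names follow the script above. It assumes that groups at the tail are directly adjacent to each other, and if the same kind of bracket appears twice in the tail, the rightmost one is kept.

```bash
#!/bin/bash
# Hypothetical helper: sets shortName, TAG, LAB and DES from NAME (no extension).
parse_name() {
    local rest=$1 group body
    # Group 1: everything before the last bracket group; group 2: that group.
    local re='^(.*)(\[[^][]+\]|\{[^{}]+\}|\([^()]+\))$'
    TAG="" LAB="" DES=""
    while [[ $rest =~ $re ]]; do
        rest=${BASH_REMATCH[1]}
        group=${BASH_REMATCH[2]}
        body=${group:1:${#group}-2}          # drop the enclosing brackets
        case $group in
            '['*) [[ -n $TAG ]] || TAG=$body ;;
            '{'*) [[ -n $LAB ]] || LAB=$body ;;
            '('*) [[ -n $DES ]] || DES=$body ;;
        esac
    done
    # Trim the whitespace left between the name and the tail groups.
    shortName=${rest%"${rest##*[![:space:]]}"}
}

parse_name 'test1.{1}(2)[3] dots.spaces {A1}(B1)[C1]'
printf 'short=%s tag=%s lab=%s des=%s\n' "$shortName" "$TAG" "$LAB" "$DES"
# prints: short=test1.{1}(2)[3] dots.spaces tag=C1 lab=A1 des=B1
```

The loop stops as soon as the remaining string no longer ends in a bracket group, which is why `{1}(2)[3]` in the middle of the name is left alone.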
