BASH Regex - get filename tags, labels and descriptions

Tags
bash, file, regex, shell scripts, tags

# 1  
Old 1 Week Ago

Hi, I am trying to switch from Windows to Linux.
I have been using AutoHotkey scripts for some small tasks.
I started writing some bash scripts for the Nemo file browser in Linux Mint and I am trying to convert some of my AHK scripts, but as I am not a programmer I have a hard time with regex.

What I have done so far is split the file path, and that works OK: Directory, FileName, Filename-Without-Extension, Extension
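
For reference, a minimal sketch of such a split using only shell parameter expansion. The sample path and the variable names (dir, file, nameNoExt, ext) are assumptions for illustration, not the poster's actual script:

```shell
#!/bin/bash
# Hypothetical sample path; any file path would do.
path='/home/user/docs/multiple.dots and spaces [tag1 tag2].txt'

dir="${path%/*}"          # everything before the last slash
file="${path##*/}"        # everything after the last slash
nameNoExt="${file%.*}"    # drop the shortest trailing ".something"
ext="${file##*.}"         # text after the last dot

printf 'Directory = "%s"\nFileName  = "%s"\nNameNoExt = "%s"\nExtension = "%s"\n' \
    "$dir" "$file" "$nameNoExt" "$ext"
```

Note that for a name without any dot, this sketch leaves ext equal to the whole filename, so a real script would want to check for that case.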

Now every filename may or may not contain [tags], {labels} and (description) in brackets at the end of the filename.
Where I am stuck: the regex that works in AutoHotkey does not work in bash. I tried sed and awk, but I have no idea what I am doing.

example filename:
multiple.{1}(1)[1] dots.and spaces {labels}(description)[tag1 tag2].txt
filename with only tags [tag1 tag2 tag3].txt

Note:
Tags, labels or description can be any words or numbers, not just what I have in example.
Not every filename will have all 3. Some could have only tags, others only descriptions or labels.

need to split and get this:
shortName = "multiple.{1}(1)[1] dots.and spaces" -- filename without any tags,labels and descriptions
tags = "tag1 tag2" -- tags without brackets
labels = "labels" -- labels without brackets
description = "description" -- description without brackets


Original regex from Autohotkey that works in all of my scripts

Code:
; NameNoExt is filename without extension
; get filename only without tags[], labels{} or description() 
ShortName:=RegExReplace(NameNoExt, "\s*(\([^()]+\)|\{[^{}]+\}|\[[^][]+\])+$")
; Get tags
Tags:=RegExReplace(NameNoExt, ".*\[(.*)?\](?!\s).*$", "$1")
; Get labels
Label:= RegExReplace(NameNoExt, ".*\{(.*)?\}(?!\s).*$", "$1")
; Get description
Description := RegExReplace(NameNoExt, ".*\((.*)?\)(?!\s).*$", "$1")



Here is what I have in bash, and it is going nowhere.
To repeat, this is where I need help: a regex for splitting out these tags, labels and descriptions.

Code:
str="multiple.{1}(1)[1] dots.and spaces {lab}(desc)[tag1 tag2].txt"

IFS='
'

NameNoExt="${str%.*}"        # strip the extension; works for filenames with multiple dots

shortName="???"        # should become "multiple.{1}(1)[1] dots.and spaces"

#regexp=".*\[(.*)?\](?!\s).*$"    # from autohotkey
regexp="\[([^]]+)\]+$"
if [[ $NameNoExt =~ $regexp ]]
then 
  TAG="${BASH_REMATCH[1]}"
else 
  TAG="Nothing"
fi
#regexp=".*\{(.*)?\}(?!\s).*$"    # from autohotkey
regexp="\{([^}]+)\}+$"
if [[ $NameNoExt =~ $regexp ]]
then 
  LAB="${BASH_REMATCH[1]}"
else 
  LAB="Nothing"
fi
#regexp=".*\((.*)?\)(?!\s).*$"    # from autohotkey
regexp="\(([^)]+)\)+$"
if [[ $NameNoExt =~ $regexp ]]
then 
  DESC="${BASH_REMATCH[1]}"
else 
  DESC="Nothing"
fi
zenity --info --text "$(printf 'Input = "%s"\n\nNameNoExt = "%s"\nShort Name  = "%s"\nTags  = "%s"\nLabels  = "%s"\nDescription  = "%s"' "$str" "$NameNoExt" "$shortName" "$TAG" "$LAB" "$DESC")"

I highly appreciate any help with this.
Thank you all.
# 2  
Old 1 Week Ago
Hi, try
Code:
regexp='.*\[([^]]+)\]'

,
Code:
regexp='.*\{([^}]+)\}'

and
Code:
regexp='.*\(([^)]+)\)'

respectively.
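
A short sketch of how these greedy patterns behave with bash's `[[ =~ ]]` and BASH_REMATCH. The variable names are assumptions, the sample name comes from the first post, and the tag pattern here excludes `]` inside the brackets:

```shell
#!/bin/bash
# Sample name from the first post, extension already removed.
name='multiple.{1}(1)[1] dots.and spaces {lab}(desc)[tag1 tag2]'

# The leading .* is greedy, so each pattern lands on the LAST group of its kind.
re_tag='.*\[([^]]+)\]'
re_lab='.*\{([^}]+)\}'
re_desc='.*\(([^)]+)\)'

[[ $name =~ $re_tag ]]  && tag="${BASH_REMATCH[1]}"
[[ $name =~ $re_lab ]]  && lab="${BASH_REMATCH[1]}"
[[ $name =~ $re_desc ]] && desc="${BASH_REMATCH[1]}"
printf 'Tags = "%s"\nLabels = "%s"\nDescription = "%s"\n' "$tag" "$lab" "$desc"

# Nothing anchors the match to the end of the name, so a bracket
# group in the middle is picked up when there is none at the tail.
[[ 'some [1] name' =~ $re_tag ]] && stray="${BASH_REMATCH[1]}"
printf 'Stray match = "%s"\n' "$stray"
```

The patterns are kept in variables because quoting the right-hand side of `=~` would make bash treat it as a literal string rather than a regex.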
# 3  
Old 1 Week Ago
Hi, thank you.
That works for the given example. But if I remove one of the end brackets (like [*]), this regex will find another bracket group if one exists within the filename (like the [1] in the middle of the name).

In most cases (99%) filenames will not have any extra brackets within the name, only at the end. I just want to make sure there are no mistakes in case one pops up.

Is there any way to check whether the brackets are at the tail, separated by a space from the filename (without extension)? If they are, get the contents; if not, skip it.
As an example:
some [1] name [tag]
some [1] name <-- does not have brackets at the tail

Anyway, thank you very much for your reply. It helped a lot. I might end up just using that for now until I find a better solution.
Cheers
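
One way the tail check asked about in this post might be sketched: anchor the pattern with `$` and require whitespace before the bracket groups. The names tail_re and has_tail are hypothetical, and this is an illustration under those assumptions rather than a tested drop-in:

```shell
#!/bin/bash
# One or more bracket groups at the very end of the name, preceded by whitespace.
# [[:space:]] is used instead of \s because it is plain POSIX ERE.
tail_re='[[:space:]]+((\([^()]+\)|\{[^{}]+\}|\[[^][]+\])+)$'

has_tail() {   # hypothetical helper; leaves the tail in BASH_REMATCH[1]
    [[ $1 =~ $tail_re ]]
}

for name in 'some [1] name [tag]' 'some [1] name'; do
    if has_tail "$name"; then
        printf '%s -> tail "%s"\n' "$name" "${BASH_REMATCH[1]}"
    else
        printf '%s -> no tail brackets, skipped\n' "$name"
    fi
done
```

Because of the `$` anchor, the [1] in the middle of a name can never satisfy the pattern on its own.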
# 4  
Old 2 Days Ago
Ok, I think I got this thing working now.
I did not know how to extract contents A, B, C from only the end brackets, so I had to create a new variable holding the tail brackets (if they exist) and then extract A, B and C from that. Otherwise the regex would pick up 1, 2, 3 from brackets inside the filename.

Anyway, it is OK as it is now, but if there is a better solution I would love to learn it.
Note: this is for string testing only. In a real script I would check whether the path is a file or a directory before getting shortName and extension.

Code:
#!/bin/bash
test_str=(
    'test1.{1}(2)[3] dots.spaces {A1}(B1)[C1].ext'
    'test2.[3]{1}(2) dots.spaces [C1](B1){A1}.ext'
    'test3 [3].com [C1].ext'
    'test4 {1}(2)[3]dots.ext'
    'test5 {1} (2) [3] dots.ext'
    'test6 (B1)[C1].ext'
    'test7[C1](B1).ext'
)
IFS='
'
regexp='\s*(\([^)]+\)|\{[^}]+\}|\[[^][]+\])+$'
subst=""
regexTag='.*\[([^]]+)\]'
regexLab='.*\{([^}]+)\}'
regexDes='.*\(([^)]+)\)'


for p in "${test_str[@]}"
do
    NameNoExt="${p%.*}"
    if [[ $NameNoExt =~ $regexp ]]; then
        tailBrackets="${BASH_REMATCH[0]}"
        # the match is anchored at the end, so strip it as a literal suffix
        shortName="${NameNoExt%"$tailBrackets"}"
    else
        shortName="$NameNoExt"
        tailBrackets=""
    fi
    # if tail brackets exist then extract A,B,C
    if [[ -n $tailBrackets ]]; then
        if [[ $tailBrackets =~ $regexTag ]]
        then 
          TAG="${BASH_REMATCH[1]}"
        else 
          TAG=""
        fi
        if [[ $tailBrackets =~ $regexLab ]]
        then 
          LAB="${BASH_REMATCH[1]}"
        else 
          LAB=""
        fi
        if [[ $tailBrackets =~ $regexDes ]]
        then 
          DES="${BASH_REMATCH[1]}"
        else 
          DES=""
        fi
    else
        TAG=""
        LAB=""
        DES=""
    fi
    zenity --info --text "$(printf 'Filename = "%s"\n\nNameNoExt = "%s"\nShort Name  = "%s"\nTail Brackets = "%s"\nA  = "%s"\nB  = "%s"\nC  = "%s"' "$p" "$NameNoExt" "$shortName" "$tailBrackets" "$LAB" "$DES" "$TAG")"
done
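
One possible alternative to the two-step approach above, offered as a sketch rather than a definitive answer: peel one bracket group off the end of the name per pass, so brackets in the middle are never examined. The function name parse_name is hypothetical, and the rule that each kind of group is taken at most once is an assumption, not something stated in the thread:

```shell
#!/bin/bash
# parse_name NAME : sets shortName, TAG, LAB and DES (hypothetical helper).
parse_name() {
    local rest="$1"
    local re_tag='^(.*)\[([^][]+)\]$'
    local re_lab='^(.*)\{([^{}]+)\}$'
    local re_des='^(.*)\(([^()]+)\)$'
    TAG="" LAB="" DES=""

    # Strip one trailing group per pass; stop when the name no longer
    # ends in a group of a kind that has not been seen yet.
    while :; do
        if   [[ -z $TAG && $rest =~ $re_tag ]]; then TAG="${BASH_REMATCH[2]}"
        elif [[ -z $LAB && $rest =~ $re_lab ]]; then LAB="${BASH_REMATCH[2]}"
        elif [[ -z $DES && $rest =~ $re_des ]]; then DES="${BASH_REMATCH[2]}"
        else break
        fi
        rest="${BASH_REMATCH[1]}"
    done

    # Drop the whitespace that separated the name from its tail groups.
    shortName="${rest%"${rest##*[![:space:]]}"}"
}

parse_name 'test1.{1}(2)[3] dots.spaces {A1}(B1)[C1]'
printf 'Short Name = "%s"\nA = "%s"\nB = "%s"\nC = "%s"\n' \
    "$shortName" "$LAB" "$DES" "$TAG"
```

Like the script above, this sketch does not require a space before the tail groups, so 'test7[C1](B1)' still yields the short name test7. It needs no call to sed and no unquoted expansions.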
