Pattern Match FileNames

 
# 1  
Old 04-18-2018

I am on AIX.

I need to list the contents of the directory based on a pattern and write an XML output file with file names.

If a filename does NOT match the below pattern then write an OUTPUT xml file in the below xml format

Pattern


Code:
Starts with (.abc) and contains (def)
Starts with (.abc) and contains (pqr)
Ends with (.xml) and contains (xyz)
Starts with (.tvs)
Contains (.hij)

Additionally, irrespective of the pattern match, if the FileName contains a space character, include that FileName in the output XML file.

Output File Structure

Code:
<Files>
<FiileName>LMN.txt</FileName>
<FileName>OTS.txt</FileName>
</Files>

Please advise

Last edited by RudiC; 04-18-2018 at 04:45 AM..
# 2  
Old 04-18-2018
Some sample data would help as the inverse complex pattern is beyond my imagination. What's your shell / version? Your files have leading dots? I guess <FiileName> is a typo?
# 3  
Old 04-18-2018
Hello techedipro,

Which AIX version are you running? It might make a difference.

Are these conditions to be AND-ed together or OR-ed together? Please be specific else we might go the wrong way about this. As RudiC mentions, it would be better to know what to positively look for rather than try to strip out the unwanted.

Perhaps (depending on the size of the file list being processed) you could use these tests to build a list of files that you want to exclude and then use something like grep -vFf remove_these main_file_list > wanted_file_list to get you close, but if the remove list gets large then it can become slow and the results less predictable. Going this long way round costs processing and IO, but it is possible if there is no better logic to positively select what you want to report.
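
A minimal sketch of that exclusion-list idea, in case it helps. The function name filter_names is made up for illustration, and the -x option (match whole lines only) is an addition that is not part of the suggestion above; without it, a short entry in the remove list would also knock out every longer name that merely contains it. Note that -f takes the file name as its argument, so it has to come last among the bundled options.

```shell
# filter_names MAIN_LIST REMOVE_LIST   (hypothetical helper name)
# Print every line of MAIN_LIST that does not appear in REMOVE_LIST.
#   -v  invert the match (keep lines that do NOT match)
#   -x  a pattern must match the whole line (assumption: exact names wanted)
#   -F  treat patterns as fixed strings, not regular expressions
#   -f  read the patterns from the file named next
filter_names() {
    grep -vxFf "$2" "$1"
}
```

It would be called as filter_names main_file_list remove_these > wanted_file_list.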


The easiest way to get your actual output file when you have a list of names may be something like:-
Code:
awk 'BEGIN {print "<Files>"} ; {print "<FileName>"$0"</FileName>"} ; END {print "</Files>"} ' wanted_file_list  >  output_file
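
One thing the one-liner above does not handle: a file name containing &, < or > would produce XML that is not well-formed. Nothing in this thread says such names occur, so treat the following as an optional sketch rather than a requirement. It wraps the same awk idea in a function (xml_file_list is a made-up name) that reads names on standard input and escapes those three characters first.

```shell
# xml_file_list   (hypothetical helper name)
# Read one file name per line on stdin and print the <Files> document,
# escaping the characters that are special in XML text.
xml_file_list() {
    awk '
        BEGIN { print "<Files>" }
        {
            # "&" must be escaped first, otherwise the "&" of the
            # entities inserted below would be escaped again.
            # "\\&" in the replacement yields a literal ampersand.
            gsub(/&/, "\\&amp;")
            gsub(/</, "\\&lt;")
            gsub(/>/, "\\&gt;")
            print "<FileName>" $0 "</FileName>"
        }
        END { print "</Files>" }
    '
}
```

Usage would be something like xml_file_list < wanted_file_list > output_file.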


Of course, there may be a better way to blend it into a single operation, saving the processing and IO cost, but you need to help us understand the context. Some examples would be good.



I hope that this helps,
Robin
# 4  
Old 04-18-2018
Should you run a recent shell (bash, ksh) that provides "extended globbing" of "pattern-lists" (bash needs shopt -s extglob first), you might use this to feed into rbatte1's awk proposal:
Code:
ls  @(!(@(.abc*@(def|pqr)*|.tvs*|*xyz*.xml|*.hij|.hij*))|@(*\ *|.*\ *))

# 5  
Old 04-18-2018
RudiC & rbatte1

Thanks for your valuable inputs.

version : Version M-11/16/88f

I have made a minor change to the pattern, corrected the typo in the output file, and included sample file names and the expected output file.

Every FileName that does NOT match any of the patterns below, and every FileName that contains a space, should be written to an output XML file in the format below.


Code:
Ends with (.abc) and contains (DEF)
Ends with (.abc) and contains (PQR)
Ends with (.xml) and contains (XYZ)
Starts with (TVS)
Starts with (TVS) and contains (SPR)
Contains (HIJ)


FileNames

Code:
cqa_20180405_tom_DEF.abc
uvw_bs_PQR_041118120208.abc
wvu_XYZ_041118120208.xml
TVS_~tosp.sh
TVS_SPR.txt
HIJ_03_15_2018.xml
LMN.txt
OTS.txt
iws_ eti-.oiy .txt


OutputFile

Code:
<Files>
<FileName>LMN.txt</FileName>
<FileName>OTS.txt</FileName>
<FileName>iws_ eti-.oiy .txt</FileName>
</Files>
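
One possible way to get there with nothing but a plain case statement, so it should not depend on extended globbing. This is only a sketch under a few assumptions: the patterns are OR-ed together, a name containing a space is always reported (as stated in the first post), and the function name list_unmatched and its directory argument are made up for illustration. It has not been tried on AIX.

```shell
# list_unmatched [DIR]   (hypothetical helper name; DIR defaults to ".")
# Print an XML list of the names in DIR that match none of the patterns,
# plus every name that contains a space.
list_unmatched() {
    dir=${1:-.}
    echo "<Files>"
    for path in "$dir"/*; do
        [ -e "$path" ] || continue      # directory empty: glob did not expand
        name=${path##*/}                # strip the leading directory part
        case $name in
            *" "*)
                ;;                      # contains a space: always report
            *DEF*.abc|*PQR*.abc|*XYZ*.xml|TVS*|*HIJ*)
                continue ;;             # matches a pattern: leave it out
        esac
        printf '<FileName>%s</FileName>\n' "$name"
    done
    echo "</Files>"
}
```

Called as list_unmatched /some/dir > output_file. The order of the names follows the shell's glob sorting, which depends on the locale. "Starts with (TVS) and contains (SPR)" is already covered by TVS*, so it needs no pattern of its own.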
