Extract variables from filenames and output to file

# 1  
Old 04-13-2011

I need some help. I have a list of files (thousands) and would like to extract some variables from each file name and save them to a file.

The list of files looks like:
I am trying to write the following script, but I am stuck on how to get the variables 'doy' and 'yr' from each file name, combine them into two columns ("yr doy"), and write that to a file.

#!/bin/bash

# if there's an error, using a number greater than 0 will exit with that
# number
EXIT_ON_ERROR=1

cd "$dirctry" || exit 1

# process the rest of files
ls *.txt | sort -rn > imputFiles.dat
while read -r filz ; do
    if [ -e "$filz" ] ; then
        # Extract parts of the filename (offsets count from 0)
        yr=${filz:1:4}
        doy=${filz:5:3}
        hr=${filz:8:2}
        min=${filz:10:2}
        # I need to get "yr" and "doy" for each file, combine them into one
        # file with two columns, and output that to a file as shown below
    else
        echo "ERROR: Couldn't open file '$filz'" >&2
        if [ "$EXIT_ON_ERROR" -gt 0 ] ; then
            exit "$EXIT_ON_ERROR"
        fi
    fi
done < imputFiles.dat
And the required output is
2001 211
2001 213
2001 216
2001 221
2001 223
2001 224
2001 227
2001 232
Any help and ideas will be highly appreciated.
Thank you
# 2  
Old 04-13-2011
A single line of Perl will do. Let's say you save your file names to a file named 'data.txt':
 perl -pe  's/^.(\d{4})(\d{3}).*$/$1 $2/' data.txt

# 3  
Old 04-13-2011
Hi kevintse,

Thank you! That indeed does what I need. I don't fully understand the fine detail of how this line does the job, though.

I have one extra question: what if I want to add another column to the output, holding some value extracted from inside the file? Could this be done within a command like this?

Thank you
# 4  
Old 04-13-2011
What I used in the one-liner is called a regular expression.
perl -pe  's/^.(\d{4})(\d{3}).*$/$1 $2/' data.txt

The slashes are just separators; the syntax is: s/Regex/replacement/
^ matches the very start of the string (a line in data.txt).
. matches a single character (any character).
(\d{4}) \d represents a digit from 0 to 9, and the 4 in the curly braces means the pattern will match 4 digits. The outer parentheses capture those 4 digits (the yr you want); this is called a group in regular expressions.
(\d{3}) works the same way, capturing 3 digits (the doy).
.*$ matches the rest of the line, so the whole line gets replaced.
$1 $2 in the replacement stands for the 1st and 2nd groups captured by the pattern; the -p switch then prints the resulting line.
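If you want to experiment with the two capture groups without Perl, the same idea can be tried in bash itself. This is only a minimal sketch, assuming bash 3.0 or later; the function name extract_yr_doy is made up for illustration, and [0-9] is used instead of \d because bash's regex flavour (POSIX ERE) does not guarantee \d.

```shell
#!/bin/bash

# Hypothetical helper (not from the thread): print "yr doy" for one filename.
extract_yr_doy() {
    local name=$1
    # ^.          one leading character (the 'e' in the sample filename)
    # ([0-9]{4})  group 1: four digits, the year
    # ([0-9]{3})  group 2: three digits, the day of year
    local re='^.([0-9]{4})([0-9]{3})'
    if [[ $name =~ $re ]]; then
        # BASH_REMATCH[1] and [2] hold the captured groups,
        # playing the role of $1 and $2 in the Perl replacement
        printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        return 1    # the name does not fit the expected pattern
    fi
}

# Prints: 2001 211
extract_yr_doy 'e20012110129_xform_azisum_mlt_2.25_0.50.txt'
```

Keeping the pattern in a variable (re) and leaving it unquoted on the right of =~ avoids the quoting surprises that differ between bash versions.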

For your last question: yes, Perl can do what you want. It may not be as easy as the previous command, but it shouldn't be complicated.
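As a rough illustration of a third column, here is a bash sketch rather than a Perl one. It rests on a big assumption, since the thread never says which value inside the file is wanted: it takes the first whitespace-separated field on the first line of each file. The function name yr_doy_value is hypothetical too.

```shell
#!/bin/bash

# Hypothetical helper: print "yr doy value" for one file.
# ASSUMPTION: the value of interest is the first whitespace-separated field
# on the first line of the file; change the read below to suit the real data.
yr_doy_value() {
    local path=$1
    local name=${path##*/}              # strip any leading directory
    local re='^.([0-9]{4})([0-9]{3})'
    local yr doy value rest

    [[ $name =~ $re ]] || return 1      # name does not fit the pattern
    yr=${BASH_REMATCH[1]}
    doy=${BASH_REMATCH[2]}

    # read fails on a last line without a newline but still fills "value",
    # so only give up when nothing at all was read
    read -r value rest < "$path" || [[ -n $value ]] || return 1

    printf '%s %s %s\n' "$yr" "$doy" "$value"
}
```

To build the whole table you could loop over the files, for example: for f in *.txt; do yr_doy_value "$f"; done > yr_doy_value.dat (the output name is just an example; pick one that does not end in .txt so it is not picked up by the glob).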
# 5  
Old 04-13-2011
Hi kevintse,

Thank you for that explanation and the time you have taken to help. Much appreciated. I am still figuring out how to add another column. One way I am thinking of is to put the filename directly into the command, so that the command does not have to read the filename from the data.txt file.

I thought something like this would work...
perl -pe 's/^.(\d{4})(\d{3}).*$/$1 $2/' 'e20012110129_xform_azisum_mlt_2.25_0.50.txt'
so that the output is
2001 211
then I can add another variable as a third column

Any advice will be appreciated
Thank you
# 6  
Old 04-14-2011
You may pipe the filename to the perl script:
echo 'e20012110129_xform_azisum_mlt_2.25_0.50.txt' | perl -pe 's/^.(\d{4})(\d{3}).*$/$1 $2/'
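If the goal is just the two-column file, the intermediate list and the pipe can probably be skipped by looping over the files and slicing each name with bash substring expansion, using the same offsets as the original script. A minimal sketch; the function name write_yr_doy and its two arguments are made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: write "yr doy" for every *.txt file in a directory.
# Arguments: $1 = directory to scan, $2 = output file.
# The output file should not end in .txt if it lives in the scanned
# directory, otherwise the glob would pick it up as well.
write_yr_doy() {
    local dir=$1 out=$2 path name
    for path in "$dir"/*.txt; do
        [ -e "$path" ] || continue      # nothing matched: skip the literal pattern
        name=${path##*/}                # drop the directory part
        # Offsets count from 0: year at positions 1-4, day of year at 5-7
        printf '%s %s\n' "${name:1:4}" "${name:5:3}"
    done > "$out"
}
```

It could be called as write_yr_doy "$dirctry" yr_doy.dat, where yr_doy.dat is an arbitrary output name. The glob expands in sorted order, so names like the one above come out by year and then day of year.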
