Extract information from file


 
# 1  
Old 05-10-2017

A particular directory can contain 1000 files like the one below.

The file name is job901.ksh:
Code:
#!/bin/ksh
cront -x << EOJ
submit file=$PRODPATH/scripts/genReport.sh maxdelay=30
      &node=xnode01
      tname=job901
      &pfile1=/prod/mldata/data/test1.dat
      &pfile2=/prod/mldata/data/test2.dat
      &metafile1=test1.met
      &metafile2=test2.met
      &jobname=job901
      &priority=10;
      EOJ
exit


I want to read all such files and extract the information, so that the output has one row per file
in the format below. The expected output for one file is shown.

Is it possible?


HTML Code:
File      | Jobname |  node  | pfile               | metafile            | tname | priority | delay
job901.ksh| job901  | xnode01| test1.dat,test2.dat | test1.met,test2.met | job901| 10       | 30
Thanks.
# 2  
Old 05-10-2017
Hello Vedanta,

Could you please try the following and let me know if it helps.
Code:
awk -F'[=/]' 'BEGIN{print "File      | Jobname |  node  | pfile               | metafile            | tname | priority | delay"} /maxdelay/{delay=$NF;next} /node/{node=$NF;next} /tname/{name=$NF;next} /pfile/{file=file?file","$NF:$NF;next} /metafile/{metafile=metafile?metafile","$NF:$NF;next} /jobname/{jobname=$NF;next} /priority/{pri=$NF;sub(/;$/,"",pri);next} /exit/{print FILENAME OFS jobname OFS node OFS file OFS metafile OFS name OFS pri OFS delay;jobname=node=file=metafile=name=pri=delay="";}' OFS="|  " *.ksh

I haven't tested it with 1000 or more files, so let us know how it goes.
EDIT: Adding a multi-line form of the same solution here too.
Code:
awk -F'[=/]' 'BEGIN{
                        print "File      | Jobname |  node  | pfile               | metafile            | tname | priority | delay"
                   }
              /maxdelay/{
                                delay=$NF;
                                next
                        }
              /node/    {
                                node=$NF;
                                next
                        }
              /tname/   {
                                name=$NF;
                                next
                        }
              /pfile/   {
                                file=file?file","$NF:$NF;
                                next
                        }
              /metafile/{
                                metafile=metafile?metafile","$NF:$NF;
                                next
                        }
              /jobname/ {
                                jobname=$NF;
                                next
                        }
              /priority/{
                                pri=$NF;
                                sub(/;$/,"",pri);   # drop the trailing ";" after the value
                                next
                        }
              /exit/    {
                                print FILENAME OFS jobname OFS node OFS file OFS metafile OFS name OFS pri OFS delay;
                                jobname=node=file=metafile=name=pri=delay="";
                        }
             ' OFS="| "   *.ksh

Thanks,
R. Singh
# 3  
Old 05-10-2017
Here is a bash solution with associative arrays; it requires bash 4.
Code:
#!/bin/bash
cols="node tname pfile metafile"
space=20
declare -A C

# print the header
headersep=""
for col in $cols
do
  printf "${headersep}%${space}s" "$col"
  headersep=" | "
done
printf "\n"

# loop over the files
for jfile in job[0-9]*.ksh
do
  # loop over the lines, collect values in hash C[]
  while IFS="=" read -r key val
  do
    # skip lines that have no value after "="
    [ -z "$val" ] && continue
    case $key in
    *"&node") C[node]=$val
    ;;
    *tname) C[tname]=$val   # the sample file has "tname=" without a leading "&"
    ;;
    *"&pfile"*) C[pfile]=${C[pfile]}${C[pfile]:+,}${val##*/}
    ;;
    *"&metafile"*) C[metafile]=${C[metafile]}${C[metafile]:+,}${val##*/}
    ;;
    esac
  done < "$jfile"

  # print and clear C[]
  sep=""
  for col in $cols
  do
    printf "${sep}%${space}s" "${C[$col]}"
    unset "C[$col]"
    sep=$headersep
  done
  printf "\n"
done

It is not yet complete: it covers only the node, tname, pfile and metafile columns.
But once you understand how it works, it is easy to expand.
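
As a rough sketch of such an expansion, the function below produces all eight requested columns for one file. To keep it short it uses plain variables instead of the associative array, so it should also run in a POSIX sh. The name job_row is made up for this example, and it assumes one key=value pair per line (as in job901.ksh) with maxdelay as the last word on the submit line.

```shell
#!/bin/sh
# Sketch: one pipe-separated row per job file, covering all requested columns.
# Assumptions: one "key=value" per line as in job901.ksh, and maxdelay is the
# last word on the submit line.

# job_row FILE  (hypothetical helper name)
# The body runs in a subshell, so the variables do not leak to the caller.
job_row() (
  node= tname= pfile= metafile= jobname= priority= delay=
  while IFS='=' read -r key val
  do
    # skip lines that have no value after "="
    [ -z "$val" ] && continue
    case $key in
    submit*)       delay=${val##*maxdelay=} ;;
    *"&node")      node=$val ;;
    *tname)        tname=$val ;;
    *"&pfile"*)    pfile=${pfile}${pfile:+,}${val##*/} ;;
    *"&metafile"*) metafile=${metafile}${metafile:+,}${val##*/} ;;
    *"&jobname")   jobname=$val ;;
    *"&priority")  priority=${val%;} ;;   # drop the trailing ";"
    esac
  done < "$1"
  # column order: File, Jobname, node, pfile, metafile, tname, priority, delay
  printf '%s|%s|%s|%s|%s|%s|%s|%s\n' \
    "${1##*/}" "$jobname" "$node" "$pfile" "$metafile" "$tname" "$priority" "$delay"
)
```

It could then be called in the same kind of loop as above, for example: for jfile in job[0-9]*.ksh; do job_row "$jfile"; done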
# 4  
Old 05-11-2017
Thanks, guys, for the help.
Is there a way to use
Code:
awk '/^submit/{print FILENAME;nextfile}' *.sh

instead of
Code:
*.sh

as the file list for awk in the script above?
The directory can contain files with different patterns, and I want to pick only those files (1000 in this case) that contain a line starting with submit.
Can the last line be modified to select only those files, changing
Code:
' OFS="| "   *.ksh

into something like
Code:
' OFS="| "   awk '/^submit/{print FILENAME;nextfile}' *.sh # pick only those selected files

@Ravinder, It worked. Many thanks!

# 5  
Old 05-11-2017
Hello Vedanta,

Please always open a new thread for a new question. Coming to your question: if you want to print only the names of the files that have the string submit at the start of a line, a slight change to your code is enough.
Code:
 awk '/^submit/{print FILENAME;nextfile}' *.sh

Since you haven't told us what your Input_files look like, I am removing the OFS part here; it is not required anyway, since you are printing only the file names. If your Input_files are | delimited, you could add -F"|" after awk in the code above, and if the string submit is in a specific field, you could test only that field.

Thanks,
R. Singh
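
A minimal sketch of the field-specific variant mentioned above, assuming | delimited input. The helper name files_with_submit_in_field and the choice of field number are made up for illustration, since the thread does not say which field would hold the string.

```shell
#!/bin/sh
# Sketch: print each file in which the chosen "|"-separated field starts with "submit".
# files_with_submit_in_field N FILE...  (hypothetical helper; N is the field number)
files_with_submit_in_field() {
  field=$1
  shift
  # $n is the N-th field; nextfile stops reading a file after its first match
  awk -F'|' -v n="$field" '$n ~ /^submit/ { print FILENAME; nextfile }' "$@"
}
```

For example, files_with_submit_in_field 2 *.sh would list the files whose second field starts with submit on some line.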
# 6  
Old 05-12-2017
Hi,

It did not work when I replaced *.sh with
Code:
awk '/^submit/{print FILENAME;nextfile}' *.sh

in the original solution you provided. It fails with the error "can not open file awk".
# 7  
Old 05-12-2017
Hello Vedanta,

I just tested with 3 files and it worked for me. Could you please paste the exact error here? Also make sure you have at least read permission on the files in which you are searching for the keyword.

Thanks,
R. Singh
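
The error reported in post #6 is most likely what happens when the inner awk command is pasted in place of the file list: the outer awk then receives the word awk and the quoted program as file operands and tries to open a file literally named awk. One way around it is command substitution, sketched below. The helper names are made up for this example, the report is cut down to three columns for brevity, and file names are assumed to contain no whitespace because the list is word-split.

```shell
#!/bin/sh
# Sketch: run the report only on files that contain a line starting with "submit".
# Assumption: file names contain no spaces or newlines, since the list is word-split.

# list_submit_files FILE...  (hypothetical helper name)
# Prints the name of each file that has a line beginning with "submit".
list_submit_files() {
  awk '/^submit/{print FILENAME; nextfile}' "$@"
}

# report FILE...  (hypothetical helper name)
# Reduced version of the awk report from post #2: file name, jobname, node.
report() {
  awk -F'[=/]' -v OFS='| ' '
    /node/    { node = $NF; next }
    /jobname/ { jobname = $NF; next }
    /exit/    { print FILENAME, jobname, node; jobname = node = "" }
  ' "$@"
}

# report_submit_only FILE...  (hypothetical helper name)
report_submit_only() {
  # Command substitution captures the output of the inner awk as a file list.
  files=$(list_submit_files "$@")
  # With an empty list awk would wait on standard input, so stop early.
  [ -n "$files" ] || return 0
  # Unquoted on purpose: the list is split on whitespace into file arguments.
  report $files
}
```

Called as report_submit_only *.ksh, this passes only the matching files to the report. The same idea applies to the full script from post #2 by ending it with ' OFS="| " $(awk '/^submit/{print FILENAME;nextfile}' *.ksh).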