File manipulation with AWK and SED


 
# 1  
Old 12-01-2009

Hello

How do I check that the correct input files are used when using awk and sed for file manipulation?

e.g.

awk '/bin/ {print $0 }' shell.txt
sed 's/hp/samsung/' printers.txt

How do I ensure that the files I am working with are the correct ones?
# 2  
Old 12-01-2009
Can you be more specific? I don't understand what you mean by checking that your input files are "correct". Are you referring to a correct file format or a correct file name, for example?
# 3  
Old 12-01-2009
I am referring to the correct file name.
# 4  
Old 12-01-2009
The answer would depend on the context.

In your examples the code is as it would be written on the command line, in which case you would manually check that you are in the correct directory and using the correct file. That answer is so obvious that I assume you mean something more, perhaps performing some validation within a script before executing these commands?
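
As a rough illustration of that kind of in-script validation, here is a minimal sketch. The helper name check_input is made up for this example (it is not part of awk or sed), and a temporary file stands in for printers.txt. It only proves the file exists, is readable, and is not empty; it cannot tell you whether it is the file you meant.

```shell
#!/bin/sh
# check_input is a hypothetical helper, not part of awk or sed.
# It succeeds only if its argument is a readable, non-empty regular file.
check_input() {
    if [ ! -f "$1" ]; then
        echo "Not a regular file: $1" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "Not readable: $1" >&2
        return 1
    fi
    if [ ! -s "$1" ]; then
        echo "Empty file: $1" >&2
        return 1
    fi
    return 0
}

# Demo: a throwaway file stands in for printers.txt.
demo=$(mktemp)
echo "hp laserjet" > "$demo"

# sed runs only if the check passes.
check_input "$demo" && sed 's/hp/samsung/' "$demo"   # prints: samsung laserjet

rm -f "$demo"
```

The same guard works in front of the awk command; the point is simply that the test runs before the tool ever opens the file.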

As a general rule, the more information and context you provide when posting, the more helpful and immediate the feedback you get.

If you are asking in the context of a specific script, try posting the relevant fragment between code tags.

Kind regards

steady
# 5  
Old 12-03-2009
Typical File Validation Scenario

OK, as promised, here is a typical script that uses awk to process a CSV file created from an Excel spreadsheet. It performs various validation checks on the files and directories involved.

The input file, called "file_validation.csv":

Code:
WORK ORDER,DESCRIPTION,PROJECT,RAISED BY,STATUS,ASSIGNED TO,DATE RAISED,DATE COMPLETED
WO_102,Automate delivery of desktop apps to testers,Metering,wateruchav,new,atkinsb,03/06/2007,13/06/2007
WO_105,Create logging script for desktop installations,Metering,patelb,completed,lanem,05/06/2007,12/06/2007
WO_106,Create tool to automatically deploy patches to metering environments,Metering,smithc,completed,atkinsb,08/06/2007,21/06/2007
WO_107,Create tool to gather metrics for the tools team work load,Jupiter,atkinsb,new,atkinsb,11/06/2007,28/06/2007
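
Before processing a file like this, it can be worth confirming that every line has the expected number of fields. The sketch below is one possible way to do that; check_csv_fields is a hypothetical helper name, and it splits naively on commas, so it assumes no field contains a quoted comma (true of the sample data above, but not of CSV in general).

```shell
#!/bin/sh
# check_csv_fields FILE COUNT   (hypothetical helper for illustration)
# Reports every line whose comma-separated field count differs from COUNT
# and returns non-zero if any such line was found.
check_csv_fields() {
    awk -F, -v want="$2" '
        NF != want {
            printf "line %d: %d fields, expected %d\n", NR, NF, want
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Demo: the header and first record of file_validation.csv (8 fields each).
csv=$(mktemp)
cat > "$csv" <<'EOF'
WORK ORDER,DESCRIPTION,PROJECT,RAISED BY,STATUS,ASSIGNED TO,DATE RAISED,DATE COMPLETED
WO_102,Automate delivery of desktop apps to testers,Metering,wateruchav,new,atkinsb,03/06/2007,13/06/2007
EOF

check_csv_fields "$csv" 8 && echo "field count OK"

rm -f "$csv"
```

A blank line in the file counts as zero fields and is reported as well, which is usually what you want for a spreadsheet export.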

The script to process it, called "file_validation":

Code:
#! /bin/ksh
#############################################################################################################
##
##        Name                -        file_validation
##        Author              -        Bradley Atkins
##        Description         -        Example code to illustrate typical file validation 
##           techniques in a ksh shell script that uses awk to process a csv file
##           created from an excel spreadsheet.
##        Date                -        03/12/2009
##        Args                -        
##        Environment         -        
##        Return              -        1 Error
##                                     0 Success
##
#############################################################################################################
##-----------------------------------------------
## Functions
##-----------------------------------------------
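tidyupexit()
{
 ## $1 = exit status, $2 = optional message
 ## DATE is still unset if we are called before initialisation, so guard against it
 [[ -n $DATE && -d /tmp/${DATE}_muse ]] && rm -rf "/tmp/${DATE}_muse"       ## Use the full path rather than $TMP_DIR
 [[ -n $2 ]] && echo "$2"
 exit ${1:-0}
}
get_metrics()
{
 ##-----------------------------------------------
 ## Return the requested metrics
 ## Args: <csv file> <field number> <target string> <query type>
 ##-----------------------------------------------
 [[ $# -ne 4 ]] && tidyupexit 1 "Usage error, get_metrics()"
 FILE=$1
 [[ -r $FILE ]] || tidyupexit 1 "File not found / readable. get_metrics()"
 FIELD=$2
 TARGET=$3
 QUERY_TYPE=$4
 [[ $QUERY_TYPE == +([0-9]) ]] || tidyupexit 1 "Non-numeric query type"
 ## nawk prints each line whose field matches the target (case-insensitive);
 ## wc -l counts them. Both query types currently apply the same test.
 print $(nawk -F, -v f="$FIELD" -v t="$TARGET" -v qt="$QUERY_TYPE" '
  ( (qt == 1) && (tolower($f) == tolower(t)) )
  ( (qt == 2) && (tolower($f) == tolower(t)) )
 ' "$FILE" | wc -l)
}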
[[ $# -ne 1 ]] && tidyupexit 1 "Usage error. Incorrect parameter count, 1 expected <csv file>"
##-----------------------------------------------
## Initialise our variables etc
##-----------------------------------------------
MSCRIPTNAME=file_validation
DATE=$(date +'%Y%m%d')
TMP_DIR=/tmp/${DATE}_muse
RESULTS_DIR=~/${DATE}_muse.results
CSV_FILE=$1
[[ -r $CSV_FILE ]] || tidyupexit 1 "Input file not found / readable ($CSV_FILE)"
typeset cWORK_ORDER=1 cDESCRIPTION=2 cPROJECT=3 cRAISED_BY=4 cSTATUS=5 cASSIGNED_TO=6 cDATE_RAISED=7 cDATE_COMPLETED=8 \
DEVELOPERS="atkinsb lanem patelb" HEADER TMPSTR qtDEV=1 qtSTA=2 NUMBER TOTAL_NUMBER STATUS_STRINGS
STATUS_STRINGS="new completed in_progress code_review uat"
##-----------------------------------------------
## Create our temporary files etc
##-----------------------------------------------
mkdir -p $TMP_DIR || tidyupexit 1 "Failed to create temp directory"
TMPFILE1=$TMP_DIR/muse.${MSCRIPTNAME}.tmp1
SWAPFILE=$TMP_DIR/muse.${MSCRIPTNAME}.tmp2
##-----------------------------------------------
## Create our output files
##-----------------------------------------------
mkdir -p $RESULTS_DIR || tidyupexit 1 "Failed to create results directory"
DEV_CSV=${RESULTS_DIR}/developers.csv
STA_CSV=${RESULTS_DIR}/status.csv
##-----------------------------------------------
## Process the CSV file and get our metrics
##-----------------------------------------------
HEADER="DEVELOPER TOTALS"                                       ## Developer Metrics
NUMBER=0
TOTAL_NUMBER=0
TMPSTR=""
for D in $DEVELOPERS;do
 typeset -u HEADER=${HEADER}","$D
 NUMBER=$(get_metrics $CSV_FILE $cASSIGNED_TO $D $qtDEV)
 TMPSTR=${TMPSTR}","$NUMBER
 TOTAL_NUMBER=$(( TOTAL_NUMBER + NUMBER ))
done
echo $HEADER > $DEV_CSV
echo ${TOTAL_NUMBER}${TMPSTR} >> $DEV_CSV
HEADER="STATUS TOTALS"                                          ## Status Metrics 
NUMBER=0
TOTAL_NUMBER=0
TMPSTR=""
for S in $STATUS_STRINGS;do
 typeset -u HEADER=${HEADER}","$S
 NUMBER=$(get_metrics $CSV_FILE $cSTATUS $S $qtSTA)
 TMPSTR=${TMPSTR}","$NUMBER
 TOTAL_NUMBER=$(( TOTAL_NUMBER + NUMBER ))
done
echo $HEADER > $STA_CSV
echo ${TOTAL_NUMBER}${TMPSTR} >> $STA_CSV

tidyupexit 0
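
The counting done by get_metrics can also be tried in isolation. The following is only a sketch of the same idea, not a replacement for the function above: count_matches is a hypothetical name, it uses plain awk rather than nawk (an assumption made for portability), and it lets awk do the counting itself instead of piping to wc -l.

```shell
#!/bin/sh
# count_matches FILE FIELD TARGET   (hypothetical helper for illustration)
# Prints how many lines have TARGET, compared case-insensitively,
# in comma-separated field number FIELD.
count_matches() {
    awk -F, -v f="$2" -v t="$3" '
        tolower($f) == tolower(t) { n++ }
        END { print n + 0 }          # n + 0 prints 0 when nothing matched
    ' "$1"
}

# Demo with the sample data from file_validation.csv.
csv=$(mktemp)
cat > "$csv" <<'EOF'
WORK ORDER,DESCRIPTION,PROJECT,RAISED BY,STATUS,ASSIGNED TO,DATE RAISED,DATE COMPLETED
WO_102,Automate delivery of desktop apps to testers,Metering,wateruchav,new,atkinsb,03/06/2007,13/06/2007
WO_105,Create logging script for desktop installations,Metering,patelb,completed,lanem,05/06/2007,12/06/2007
WO_106,Create tool to automatically deploy patches to metering environments,Metering,smithc,completed,atkinsb,08/06/2007,21/06/2007
WO_107,Create tool to gather metrics for the tools team work load,Jupiter,atkinsb,new,atkinsb,11/06/2007,28/06/2007
EOF

count_matches "$csv" 6 atkinsb    # ASSIGNED TO column, prints 3
count_matches "$csv" 5 NEW        # STATUS column, prints 2

rm -f "$csv"
```

The header line never matches because "ASSIGNED TO" and "STATUS" are not among the names or status strings being searched for, which is why neither version needs to skip it explicitly.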

Be sure to mess around with it before asking me any questions; it is much more satisfying to figure it out for yourself.

If you don't know how to trace the code while it is running, let me know.

Enjoy :)
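
For anyone who has not traced a shell script before, one common approach is the shell's xtrace option. The sketch below is a small stand-alone demonstration rather than part of file_validation; count_lines and the temporary file are made up for the example.

```shell
#!/bin/sh
# Tracing sketch: "set -x" makes the shell write each command to stderr
# after expansion and before running it; "set +x" turns tracing off again.

# Hypothetical helper so there is something to trace.
count_lines() {
    wc -l < "$1" | tr -d ' '
}

tmp=$(mktemp)
printf 'a\nb\n' > "$tmp"

set -x                      # start tracing
n=$(count_lines "$tmp")
set +x                      # stop tracing

echo "lines: $n"            # prints: lines: 2
rm -f "$tmp"
```

To trace a whole script without editing it, the same option can be given to the interpreter, for example: ksh -x ./file_validation file_validation.csv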
# 6  
Old 12-04-2009

Thanks