The UNIX and Linux Forums  

#1
02-09-2009
Dipali (Registered User)
Validating input based on fixed number of fields

[post moved]

Yes, I did... let me state my problem in more detail.
Inputs:
I have one input CSV file.
I have also stored the number of commas each line should contain in a variable.
e.g.

$ cat cmt.csv
this, is a ,comma ,count test1
,,this, is a ,comma ,count test2
this, is a ,comma ,count test3
Thisisaline,withoutspace,test1
Thisisaline,withoutspace,test2
this, line is ,lacking comma test1
this, line is ,lacking comma test2

and,
export row_comma=3

Now, the requirement is:
if the number of commas in a line is not equal to $row_comma, write that line into cmt.bad and remove it from cmt.csv.
To achieve this, I have written the script below. It works fine if there are NO spaces in the input cmt.csv, but whenever a line contains a space the result is a mess.

#$FEED_DIR/$feed_nm=cmt.csv
for k in `cat $FEED_DIR/$feed_nm`
do
    echo $k > $FEED_DIR/tmp.lst
    line_comma_cnt=`awk ' BEGIN { count=0; }
        { for (i=1;i<=length($0);i++)
            {
                if(substr($0,i,1)=="," ) {count++;}
            }
        }
        END {print count} ' $FEED_DIR/tmp.lst`
    echo "New count: $line_comma_cnt"

    #rm -f $EXCEPTION_DIR/${feed_nm}_insufficient_data.bad
    rm -f $FEED_DIR/$feed_nm.tmp

    #echo line comma count $line_comma_cnt and row comma count is $row_comma

    if [[ $line_comma_cnt != $row_comma ]]
    then
        # echo "count mismatch, prepare separate files"
        # cp $FEED_DIR/$feed_nm $FEED_DIR/$feed_nm.orig
        #echo line is $k
        # grep `cat $FEED_DIR/tmp.lst ` >> $FEED_DIR/lack.txt
        grep $k $FEED_DIR/$feed_nm >> $EXCEPTION_DIR/${feed_nm}_insufficient_data.bad
        # grep -v $k $FEED_DIR/$feed_nm > $FEED_DIR/$feed_nm.tmp
        # mv $FEED_DIR/$feed_nm.tmp $FEED_DIR/$feed_nm
    fi
done

#echo intial insuff count is `wc -l $EXCEPTION_DIR/${feed_nm}_insufficient_data.bad`

for d in `cat $EXCEPTION_DIR/${feed_nm}_insufficient_data.bad `
do
    #echo "in test area `wc -l $EXCEPTION_DIR/${feed_nm}_insufficient_data.bad` "
    grep -v $d $FEED_DIR/$feed_nm > $FEED_DIR/${feed_nm}.tmp
    #echo after temp `wc -l $FEED_DIR/${feed_nm}.tmp `

    mv $FEED_DIR/${feed_nm}.tmp $FEED_DIR/$feed_nm
    # cp $FEED_DIR/${feed_nm}.tmp $FEED_DIR/$feed_nm
done

I would really appreciate it if someone could help me get the desired result in an optimal way.

Thanks
Dipali

Last edited by radoulov; 02-09-2009 at 08:09 AM.. Reason: new thread opened
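
A likely reason the script above misbehaves on lines with spaces: `for k in ...` over the output of `cat` iterates over whitespace-separated words, not lines, so `$k` is never a whole record once a line contains a space. The sketch below contrasts that loop with a `while read` loop, which sees one line per iteration. The file name demo.csv and the counter variables are made up for illustration.

```shell
# Hypothetical two-line demo file (names are assumptions, not from the thread).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'this, is a ,comma ,count test1' \
  'Thisisaline,withoutspace,test1' > "$tmpdir/demo.csv"

# Unquoted command substitution is split on whitespace:
# the loop body runs once per word.
words=0
for k in $(cat "$tmpdir/demo.csv"); do
  words=$((words + 1))
done

# Reading with IFS= and -r keeps each line intact:
# the loop body runs once per line.
lines=0
while IFS= read -r line; do
  lines=$((lines + 1))
done < "$tmpdir/demo.csv"

echo "for-loop iterations: $words, while-read iterations: $lines"
rm -rf "$tmpdir"
```

The first line of the demo file alone accounts for six iterations of the `for` loop, which is why the per-line comma count in the original script comes out wrong.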
#2
02-09-2009
radoulov (Forum Staff)
You can try something like this (back up your data first; on Solaris use gawk, nawk or /usr/xpg4/bin/awk):


Code:
awk 'BEGIN {
  FS = ","; te = "tmp"; be = "bad"
  ffn = fn = ARGV[2]; sub(/[^.]*$/,"", fn)
  }
{ 
  print > (fn (NF != nf + 1 ? be : te)) 
  }
END {
  if (system("mv " fn te OFS ffn)) {
    print "error moving", fn te, "to", ffn | "cat >&2"
    exit 1
    }
  }' nf="$row_comma" cmt.csv

#3
02-09-2009
Dipali (Registered User)
Thanks for your reply.
But this block is giving an error (possibly a simple one, but I am not good at awk programming), so please suggest what I am doing wrong.

awk 'BEGIN {
> FS = ","; te = "tmp"; be = "bad"
> ffn = fn = ARGV[2]; sub(/[^.]*$/,"", fn)
> }
> {
> print > (fn (NF != nf + 1 ? be : te))
> }
> END {
> if (system("mv " fn te OFS ffn)) {
> print "error moving", fn te, "to", ffn | "cat >&2"
> exit 1
> }
> }' nf="$row_comma" cmt.csv
mv: cannot stat `cmt.tmp': No such file or directory
error moving cmt.tmp to cmt.csv
#4
02-09-2009
radoulov (Forum Staff)
It seems there are no valid records in cmt.csv ...
Try this one:


Code:
awk 'BEGIN {
  FS = ","; te = "tmp"; be = "bad"
  ffn = fn = ARGV[2]; sub(/[^.]*$/,"", fn)
  }
{ 
  print > (fn (NF != nf + 1 ? be : te)) 
  }
END {
  if (system("[ -f " fn te " ] && mv " fn te OFS ffn)) {
    print "error moving", fn te, "to", ffn, \
    "or no valid records found" | "cat >&2"
    exit 1
    }
  }' nf="$row_comma" cmt.csv
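
To check whether a file really has no valid records, it can help to print the comma count per line first. With `FS=","` a line containing k commas has k+1 fields, so `NF - 1` is the comma count. The following is only a diagnostic sketch: the sample lines are taken from the first post, and the `report` variable and the ok/bad labels are assumptions added for illustration.

```shell
row_comma=3   # expected commas per line, as in the first post
tmpdir=$(mktemp -d)
printf '%s\n' \
  'this, is a ,comma ,count test1' \
  ',,this, is a ,comma ,count test2' \
  'Thisisaline,withoutspace,test1' > "$tmpdir/cmt.csv"

# Print the comma count of each line and whether it matches the expected value.
report=$(awk -F, -v n="$row_comma" '
  { printf "%d %s\n", NF - 1, (NF - 1 == n ? "ok" : "bad") }
' "$tmpdir/cmt.csv")

printf '%s\n' "$report"
rm -rf "$tmpdir"
```

If every line comes out as bad, one thing worth checking is that `$row_comma` is actually set in the shell that runs the awk command.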

#5
02-09-2009
Franklin52 (Forum Staff, Moderator)
Another approach:


Code:
awk -F"," '
NF != n+1 {print > "cmt.bad"; next}
1' n=$row_comma cmt.csv > cmt.new 

mv cmt.new cmt.csv

Regards.
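
For anyone who wants to try this approach without touching real data, here is a self-contained run in a scratch directory. The sample lines come from the first post; the scratch directory and the `good`/`bad` counters are assumptions added for the demonstration.

```shell
old_dir=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

row_comma=3
cat > cmt.csv <<'EOF'
this, is a ,comma ,count test1
,,this, is a ,comma ,count test2
this, is a ,comma ,count test3
Thisisaline,withoutspace,test1
this, line is ,lacking comma test1
EOF

# Lines whose field count is not row_comma+1 go to cmt.bad;
# all other lines are printed to cmt.new, which then replaces cmt.csv.
awk -F"," '
NF != n+1 {print > "cmt.bad"; next}
1' n="$row_comma" cmt.csv > cmt.new && mv cmt.new cmt.csv

good=$(grep -c '' cmt.csv)   # number of lines kept
bad=$(grep -c '' cmt.bad)    # number of lines rejected
echo "kept: $good, rejected: $bad"

cd "$old_dir" && rm -rf "$tmpdir"
```

Chaining `mv` with `&&` means the original file is only replaced if awk exits successfully.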
#6
02-10-2009
Dipali (Registered User)
Thanks a lot Radoulov & Franklin52 for such compact solutions, and the 3-liner approach is wonderful!
Well, both approaches are working fine for me. Now I need one more enhancement: I have merged this piece of code into my shell script, where I need to create the bad file name dynamically, e.g.

awk -F"," '
NF != n+1 {print > "$feed_nm.bad"; next}
1' n=$row_comma $FEED_DIR/$feed_nm > $FEED_DIR/$feed_nm.new

cp $FEED_DIR/$feed_nm.new $FEED_DIR/$feed_nm

So, can you please guide me here as well.

Thanks
Dipali
#7
02-10-2009
Franklin52 (Forum Staff, Moderator)
Try something like this:


Code:
feed_nm="cmt.csv"   # example value; the bad file then becomes cmt.csv.bad

awk -F"," '
NF != n+1 {print > bad; next}
1' n="$row_comma" bad="$feed_nm.bad" "$FEED_DIR/$feed_nm" > "$FEED_DIR/$feed_nm.new"

Regards
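
The reason `"$feed_nm.bad"` does not work inside the awk program is that the program is wrapped in single quotes, so the shell never expands `$feed_nm`; awk just sees a string literal. A small sketch of the difference follows. The value of `feed_nm` and the `literal`/`passed` variables are assumptions for illustration, and `-v` is used here instead of an operand assignment so that the value is already available in `BEGIN`.

```shell
feed_nm="cmt.csv"   # example value, an assumption for this demo

# Inside single quotes the shell leaves $feed_nm untouched,
# so awk prints the characters as written.
literal=$(awk 'BEGIN { print "$feed_nm.bad" }')

# Expanding the variable in the shell and handing the result to awk
# as an awk variable gives the intended file name.
passed=$(awk -v bad="$feed_nm.bad" 'BEGIN { print bad }')

echo "literal: $literal"
echo "passed:  $passed"
```

With the literal form, awk would write to a file whose name really is `$feed_nm.bad`, which matches the behaviour described in the question.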