The UNIX and Linux Forums  

  #1 (permalink)  
Old 10-26-2009
kshuser kshuser is offline
Registered User
  
 

Join Date: Oct 2009
Posts: 30
combine duplicate records

I have a .DAT file like below

Code:
23666483030000653-B94030001OLFXXX000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003ODL-SP592123420081227
22885068900000652-B94030001ODL-CH592123520081227
 

I would like to combine duplicate records into a single record, with the amounts from the duplicate records appended as extra fields at the end of the line. In the example file above, the first field is the key.

If a key has duplicate records (here 288506890 has 3 records), check only the duplicates whose benefit type is ODL-SP or ODL-CH:
if ODL-SP exists, get the amount at positions 34-40
if ODL-CH exists, get the amount at positions 34-40

Then append those amounts to the first record for that key (288506890), as shown below. If a key has no duplicate records, just write the line as is.
Code:
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235
The benefit type and the amount are at fixed positions:
Code:
ben_type=`echo $line|cut -c28-33`
(you get ODL-SP spouse, ODL-CH child)
amount=`echo $line|cut -c34-40`
(you get spouse=5921234, child=5921235)
So I would like my output to be like below:
Code:
23666483030000653-B94030001OLFXXX000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235



Can someone please help me with a solution using Unix ksh scripting? Thank you.


Last edited by vgersh99; 10-26-2009 at 02:26 PM.. Reason: code tags, please!
  #2 (permalink)  
Old 10-26-2009
binlib binlib is offline
Registered User
  
 

Join Date: Aug 2009
Location: New Jersey
Posts: 61
With a name like kshuser and a request for a ksh-only solution, I assume you use ksh93.
Code:
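# Assumes records with the same key are adjacent in the input.
while read x; do
  k=${x:0:17}                 # key: first 17 characters
  if [ "$k" = "$ok" ]; then
    p="$p ${x:33:7}"          # same key: append the amount (positions 34-40)
  else
    [ -n "$p" ] && echo "$p"  # new key: print the finished record
    ok=$k
    p=$x
  fi
done
echo "$p"                     # print the last pending record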
If you can add a blank line at the end of the file, e.g. with
Code:
(cat file;echo)
then you can omit the last echo outside the loop.
  #3 (permalink)  
Old 10-26-2009
kshuser kshuser is offline
Registered User
  
 

Join Date: Oct 2009
Posts: 30
I am kind of new to KSH scripting.

FILE1.DAT has the following records.

Code:
23666483030000653-B94030001ODL-Ch000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003ODL-Sp592123420081227
22885068900000652-B94030001ODL-Ch592123520081227
I wrote the script below, which produces the output file mondaytest.txt:

Code:
rec_cnt=1
while read line
do
no=`echo $line|cut -c2-10`
ben_type=`echo $line|cut -c28-33`
amount=`echo $line|cut -c34-40`
if [[ $rec_cnt -eq 1 ]]
then
echo $line >> mondaytest.txt
prior_no=$no
prev_line=$line
else
if [[ $no -eq $prior_no ]]
then
if [[ $ben_type = "ODL-SP" ]]
then
spouse_amt=$amount
prev_line="$prev_line $spouse_amt"
elif [[ $ben_type = "ODL-CH" ]]
then
child_amt=$amount
#prev_line="$prev_line $spouse_amt"
else 
echo 'invalid ben_type'
fi
#echo $prev_line $spouse_amt $child_amt>> mondaytest.txt
echo 'Insert_1' $prev_line $child_amt >> mondaytest.txt
else
echo 'Insert_2' $line >> mondaytest.txt
prev_line=$line
fi
spouse_amt=""
child_amt=""
fi 
(( rec_cnt=rec_cnt + 1 )) 
prior_no=$no
done <FILE.DAT
Output file mondaytest.txt:
Code:
23666483030000653-B94030001ODL-Ch000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003OLFXXX592123320081227 5921234
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235
I want the output file to have only 4 records, like this:
Code:
23666483030000653-B94030001ODL-Ch000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235
Can you please correct my code so that it produces the expected result above?
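
One way the loop above might be restructured, as a minimal sketch rather than a tested fix: instead of writing a line for every record, keep the current record in a buffer and write it only when the key changes, plus once more after the loop. The function name combine_records is made up for illustration. The sketch assumes records with the same key are adjacent, and it compares the benefit type in upper case because FILE1.DAT contains ODL-Sp and ODL-Ch.

```shell
# Hypothetical helper: reads fixed-width records on stdin and writes
# one line per key. Assumes records with the same key are adjacent.
combine_records() {
    prior_no=""
    out_line=""
    while IFS= read -r line
    do
        [ -n "$line" ] || continue                     # skip empty lines
        no=$(printf '%s\n' "$line" | cut -c2-10)       # key, positions 2-10
        ben_type=$(printf '%s\n' "$line" | cut -c28-33 | tr '[:lower:]' '[:upper:]')
        amount=$(printf '%s\n' "$line" | cut -c34-40)  # amount, positions 34-40
        if [ "$no" = "$prior_no" ]
        then
            # Duplicate key: append the amount for spouse/child records only.
            case $ben_type in
                ODL-SP|ODL-CH) out_line="$out_line $amount" ;;
            esac
        else
            # New key: flush the buffered record, then start a new one.
            [ -n "$out_line" ] && printf '%s\n' "$out_line"
            out_line=$line
        fi
        prior_no=$no
    done
    # Flush the last buffered record.
    [ -n "$out_line" ] && printf '%s\n' "$out_line"
    return 0
}
```

It could be called as combine_records < FILE1.DAT > mondaytest.txt. The extra lines in the posted output come from writing to the file inside the duplicate branch; here nothing is written until the next key is seen.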


Last edited by vgersh99; 10-26-2009 at 02:23 PM.. Reason: code tags, please!
  #4 (permalink)  
Old 10-26-2009
Scrutinizer Scrutinizer is online now
Registered User
  
 

Join Date: Nov 2008
Posts: 705
Quote:
Originally Posted by binlib View Post
If you can add a blank line at the end of file (e.g.
Code:
(cat file;echo)
), you can omit the last echo outside the loop.
E.g. like so?

Code:
#!/bin/ksh
 echo|cat infile -|while read line; do
  case ${line:27:6} in
    ODL-SP|ODL-CH)
        prev+=" ${line:33:7}" ;;
    *)  [[ -n $prev ]] && print $prev
        prev=$line ;;
  esac
done > outfile
  #5 (permalink)  
Old 10-26-2009
kshuser kshuser is offline
Registered User
  
 

Join Date: Oct 2009
Posts: 30
When I ran your code, it generated the output file, but its contents are unchanged compared to the input file.

Code:
>echo|cat FILE2.DAT -|while read line
> do
> case {$line:27:6} in
> ODL-SP|ODL-CH)
> prev+=" ${line:33:7}" ;;
> *) [[ -n $prev ]] && print $prev
> prev=$line ;;
> esac
> done > OUT.txt
OUT.txt is the same as the input file FILE2.DAT:

Code:
23666483030000653-B94030001OLFXXX000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003ODL-SP592123420081227
22885068900000652-B94030001ODL-CH592123520081227

Last edited by vgersh99; 10-26-2009 at 06:20 PM.. Reason: added code tags - charged 5K bits
  #6 (permalink)  
Old 10-26-2009
Scrutinizer Scrutinizer is online now
Registered User
  
 

Join Date: Nov 2008
Posts: 705
Code:
${line:27:6}
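
The one-line correction above points at the position of the dollar sign in the case line. A small sketch of the difference, using a record from the thread; it relies on the substring expansion of ksh93 (also available in bash), and the variable names wrong and right are only for illustration:

```shell
line='22885068900000652 B86860003ODL-SP592123420081227'

# Typo: the brace is outside the expansion, so the shell expands $line
# in full and treats "{" and ":27:6}" as literal text.
wrong={$line:27:6}

# Correct: substring expansion, 6 characters starting at offset 27
# (offsets count from 0, so this covers positions 28-33).
right=${line:27:6}

printf '%s\n' "$wrong"
printf '%s\n' "$right"
```

Because the first form never equals ODL-SP or ODL-CH, every record falls through to the default branch and is printed unchanged, which matches the output reported in the previous post.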
  #7 (permalink)  
Old 10-26-2009
danmero danmero is offline Forum Advisor  
  
 

Join Date: Nov 2007
Location: 45.48-73.63
Posts: 1,432
What about:
Code:
# awk 'NF{a[substr($0,1,9)]=(a[substr($0,1,9)])?a[substr($0,1,9)] FS substr($0,34,7):$0}END{for(i in a)print a[i]}' file
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235
23797049900000654-E71060001OLFXXX000000220081227
23666483030000653-B94030001OLFXXX000000120081227
23699281320000655 E71060002OLFXXX000000320081227
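
The for (i in a) loop visits keys in an unspecified order, which is presumably why the combined record appears first in the output above. If input order matters, a variant along these lines might help. It is a sketch, not danmero's method: it keys on positions 2-10 as in the original question, appends only ODL-SP and ODL-CH amounts, and the function name combine_awk and the arrays rec and order are made up for illustration.

```shell
# Hypothetical wrapper around awk; reads records from the file given as $1.
combine_awk() {
    awk '
    NF {
        key = substr($0, 2, 9)              # key, positions 2-10
        ben = toupper(substr($0, 28, 6))    # benefit type, positions 28-33
        if (!(key in rec)) {
            order[++n] = key                # remember first-seen order
            rec[key] = $0
        } else if (ben == "ODL-SP" || ben == "ODL-CH") {
            rec[key] = rec[key] " " substr($0, 34, 7)   # amount, positions 34-40
        }
    }
    END {
        for (i = 1; i <= n; i++) print rec[order[i]]
    }' "$1"
}
```

Called as combine_awk file, this prints one line per key in the order the keys first appear, and it does not require duplicate records to be adjacent.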