How to replicate data using Uniq or awk


 
# 1  
Old 08-17-2008

Hi,

I have this scenario, where there are two classes: apple and orange.

1,2,3,4,5,6,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange

Basically, for apple I have 3 entries in the file, and for orange I have 2 entries. I'm trying to edit the file and find a way to replicate the orange data so that it also has 3 entries.

Output:-
1,2,3,4,5,6,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange
1,1,1,1,1,1,orange

This would balance the number of lines containing apple and orange.
I have tried using uniq but can't figure out how to go further from there.

Please advise. Thanks.
# 2  
Old 08-17-2008
How do you decide which "orange" line to duplicate? Is it always the first one?

Will it always be 3 and 2, or do those quantities vary? Is there other data in the file as well, or is that everything in the file?
# 3  
Old 08-17-2008
Do you mean you want to replicate line 4 as line 6?
If so, use:
head -4 filename | tail -1 >> filename
This appends line 4 as line 6 in your file.
# 4  
Old 08-18-2008
Hi,

How do you decide which "orange" line to duplicate? Is it always the first one?
> It always starts from the first one.
E.g. if the data has:

1,2,3,4,5,6,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange

So the orange lines are repeated, starting from the first occurrence of orange, until the number of orange lines matches the number of apple lines:
1,2,3,4,5,6,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange
1,1,1,1,1,1,orange

Will it always be 3 and 2, or do those quantities vary? Is there other data in the file as well, or is that everything in the file?
> The numbers can be 0, 1, ... 100, any integer, but no negative numbers.
The real data in the file has more than 6 comma-separated numbers per line; there can be up to hundreds. But I think it would be handled the same way as this small example?



Thanks.
# 5  
Old 08-18-2008
Try this:

Code:
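awk -F, '
        $NF == "apple"  { applecount++ }                   # count the apple lines
        $NF == "orange" { orangedata[++orangecount] = $0 } # remember each orange line
        1 # print the line
        END {
                # cycle through the stored orange lines until the counts match
                for (i = orangecount; orangecount > 0 && i < applecount; i++) {
                        print orangedata[(i % orangecount) + 1]
                }
        }
' filename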
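
If the class names are not known in advance, the same cycling idea can be generalized. The following is only a sketch, assuming the class label is always the last comma-separated field and that every class should be padded up to the size of the largest one; the function name balance_classes is hypothetical and not part of the original post.

```shell
# balance_classes [FILE...]: pad every class (last comma-separated field) up
# to the size of the largest class by cycling through the lines of that class.
# Reads standard input when no file is given. The name is hypothetical.
balance_classes() {
    awk -F, '
        NF == 0 { next }                       # skip blank lines
        {
            cls = $NF
            if (!(cls in count)) order[++nclasses] = cls   # remember first-seen order
            lines[cls, ++count[cls]] = $0
            if (count[cls] > max) max = count[cls]
            print                              # original lines pass through unchanged
        }
        END {
            for (c = 1; c <= nclasses; c++) {
                cls = order[c]
                n = count[cls]
                for (i = n; i < max; i++)      # same index arithmetic as above
                    print lines[cls, (i % n) + 1]
            }
        }
    ' "$@"
}

# Sample data from the first post, fed through standard input.
printf '%s\n' 1,2,3,4,5,6,apple 1,1,0,4,2,3,apple 1,3,3,3,3,4,apple \
              1,1,1,1,1,1,orange 1,2,3,1,1,1,orange | balance_classes
```

Note that the padding lines are printed at the end of the output, after all original lines, so classes that are interleaved in the input will not end up grouped together.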

# 6  
Old 08-19-2008
Try the Perl script below:

Code:
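#!/usr/bin/perl
use strict;
use warnings;

# Return a reference to a copy of @$ref, extended to $num items by
# cycling through the original items from the first one.
sub RepeatArray {
	my ($ref, $num) = @_;
	my @arr = @$ref;
	my $len = @arr;
	for (my $i = $len; $i < $num; $i++) {
		$arr[$i] = $arr[$i % $len];
	}
	return \@arr;
}

my $file = shift or die "usage: $0 file\n";
open(my $fh, '<', $file) or die "Cannot open $file: $!\n";
my (%lines, @order);
while (my $line = <$fh>) {
	chomp $line;
	next unless length $line;               # skip blank lines
	my $class = (split /,/, $line)[-1];     # class label is the last field
	push @order, $class unless exists $lines{$class};
	push @{ $lines{$class} }, $line;
}
close($fh);

# The size of the largest class is the target for every class.
my ($max) = sort { $b <=> $a } map { scalar @$_ } values %lines;
for my $class (@order) {
	print "$_\n" for @{ RepeatArray($lines{$class}, $max) };
}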
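
To check whether a file is balanced, before or after running either script, the lines per class can be counted with sort and uniq. This is a small sketch, again assuming the class label is the last comma-separated field; the function name count_classes is hypothetical.

```shell
# count_classes [FILE...]: print how many lines each class label has.
# Reads standard input when no file is given. The name is hypothetical.
count_classes() {
    # Extract the last field of every non-blank line, then let
    # sort and uniq -c group and count identical labels.
    awk -F, 'NF { print $NF }' "$@" | sort | uniq -c
}

# Sample data from the first post: reports 3 for apple and 2 for orange.
printf '%s\n' 1,2,3,4,5,6,apple 1,1,0,4,2,3,apple 1,3,3,3,3,4,apple \
              1,1,1,1,1,1,orange 1,2,3,1,1,1,orange | count_classes
```

The file is balanced when every label is reported with the same count.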
