Beginner: Count & Sort Using Arrays


 
# 1  
Old 02-04-2011

Hi,

I'm new to Linux and bash, so please forgive my ignorance. I'm wondering if anyone can help.

I have a file (mainfile.txt) with comma-delimited values, like so:
Code:
 
    $1  $2  $3
613212, 36, 57
613212, 36, 10
613212, 36, 10
677774, 36, 57
619900, 10, 10

I need to split this file into two files. Any entries whose first number ($1) appears more than once need to go into file1.txt. Any entries whose first number occurs only once need to go into file2.txt, like so:

File1.txt
Code:
613212, 36, 57
613212, 36, 10
613212, 36, 10

File2.txt
Code:
677774, 36, 57
619900, 10, 10

Any solution to this problem would be greatly appreciated, no matter how implemented.


However, I would like to get experience using arrays, so a solution incorporating them would be fantastic. I've experimented with them, like this, but can't get my head around them.
Code:
awk ' { arr[$0]++ } END { for( str in arr  ) { if ( arr[str] = 1 ) print str " " arr[str] } } ' /directory/mainfile.txt | >/directory/file1.txt
 
awk ' { arr[$0]++ } END { for( str in arr  ) { if ( arr[str] > 1 ) print str " " arr[str] } } ' /directory/mainfile.txt | >/directory/file2.txt

It's a botch job and, not surprisingly, doesn't work. Even if it did, how do I specify $1, and only $1, as the thing to look at, instead of the whole string?

Cheers,

Ian

Last edited by Scott; 02-04-2011 at 12:05 PM.. Reason: Please use code tags
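A note on the attempt in post #1, offered as a minimal sketch rather than a definitive fix. Three things stand out: `arr[str] = 1` assigns instead of comparing (the test is `==`), `| >file` pipes awk's output into an empty command so the file is truncated and nothing reaches it, and indexing by `$0` counts whole lines rather than first fields. Indexing the array by `$1` counts the IDs instead. The separator `-F', *'` (a comma followed by optional spaces) is an assumption based on the sample data, and the scratch directory, `counts.txt`, and the here-document exist only for illustration.

```shell
# Scratch directory and sample data, for illustration only.
workdir=$(mktemp -d) && cd "$workdir"
cat > mainfile.txt <<'EOF'
613212, 36, 57
613212, 36, 10
613212, 36, 10
677774, 36, 57
619900, 10, 10
EOF

# Count how often each first field occurs. The array index is $1, not $0.
# "for (id in count)" visits keys in no particular order, hence the sort.
awk -F', *' '
    { count[$1]++ }
    END {
        for (id in count)
            print id, count[id], (count[id] == 1 ? "unique" : "repeated")
    }
' mainfile.txt | sort > counts.txt

cat counts.txt
```

This only reports the counts. Splitting the original lines into two files needs either a second pass over the input or the lines kept in memory, because the final count for an ID is not known until the whole file has been read.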
# 2  
Old 02-04-2011
There are probably more efficient ways as this involves two passes:
Code:
awk '{ arr[$1]++; next } END { for (i in arr) { print arr[i], i } }' mainfile.txt > t.tmp
awk 'FILENAME == "t.tmp"        { arr[$2] = $1; next }
     FILENAME == "mainfile.txt" { if (arr[$1] > 1)
                                      { print $0 > "file1.txt" }
                                  else
                                      { print $0 > "file2.txt" } }' t.tmp mainfile.txt

# 3  
Old 02-04-2011
Hi Jim,

Thanks for the reply. It's almost there but not quite.

After running it, my files are as follows:

Code:
$ cat /mainfile.txt
613212 36 57
613212 36 10
613212 36 10
677774 36 57
619900 10 10

$ cat /t.tmp
1 619900
3 613212
1 677774

$ cat /file1.txt  -  *empty*

$ cat /file2.txt
613212 36 57
613212 36 10
613212 36 10
677774 36 57
619900 10 10

The t.tmp file is spot on: it knows there are three instances of 613212 and one each of the others.

However, when it comes to sending them to files, file1.txt contains nothing when it should have three entries (all 613212), and file2.txt should have only the other two entries.

I've tried changing certain things, but I'm shooting in the dark here.

Thanks again for helping.

Cheers

Ian

Last edited by Scott; 02-04-2011 at 01:23 PM.. Reason: Be a good fellow, and please use code tags.
# 4  
Old 02-04-2011
Try:
Code:
awk 'NR==FNR && ++a[$1] >1{b[$1]=1} NR!=FNR{if(b[$1]) {print >>"file1"} else {print >> "file2"}}' file file

# 5  
Old 02-05-2011
Code:
awk -F, 'A[$1]++{print>"file2.txt";next}1' mainfile.txt >file1.txt


Last edited by Scrutinizer; 02-05-2011 at 08:47 AM..
# 6  
Old 02-05-2011
Quote:
Originally Posted by Scrutinizer
Code:
awk -F, 'A[$1]++{print>"file2.txt";next}1' mainfile.txt >file1.txt

@Scrutinizer, for any $1 that occurs more than once, the first of its lines goes to the wrong file.
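The reason is that `A[$1]++` is a post-increment: it yields the count from before the current line was seen, so it is 0 (false) for the first occurrence of every key, repeated or not. A small sketch that prints that value next to each line of the sample data from post #1 (the scratch directory, `trace.txt`, and the variable name `seen_before` are illustrative only):

```shell
# Scratch directory and sample data, for illustration only.
workdir=$(mktemp -d) && cd "$workdir"
cat > mainfile.txt <<'EOF'
613212, 36, 57
613212, 36, 10
613212, 36, 10
677774, 36, 57
619900, 10, 10
EOF

# Show the value that the pattern A[$1]++ evaluates to on each line.
awk -F, '{ seen_before = A[$1]++; print seen_before ": " $0 }' mainfile.txt > trace.txt

cat trace.txt
# 0: 613212, 36, 57
# 1: 613212, 36, 10
# 2: 613212, 36, 10
# 0: 677774, 36, 57
# 0: 619900, 10, 10
```

Only the lines with a non-zero value match the pattern in that one-liner, so the first 613212 row is written together with the two unique rows.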

Another approach:
Code:
awk -F, 'NR==FNR{a[$1]++;next}{print > (a[$1]>1?"File1.txt":"File2.txt")}' file file
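Spelled out with comments, the two-pass idiom above might read as follows. This is a sketch of the same logic, not a different method: the input file is named twice on the command line, and `NR == FNR` holds only while the first copy is being read. The lowercase output names follow post #1, and the scratch directory and here-document are for illustration only.

```shell
# Scratch directory and sample data, for illustration only.
workdir=$(mktemp -d) && cd "$workdir"
cat > mainfile.txt <<'EOF'
613212, 36, 57
613212, 36, 10
613212, 36, 10
677774, 36, 57
619900, 10, 10
EOF

awk -F, '
    # First pass: NR == FNR holds only while the first file argument is read.
    NR == FNR { count[$1]++; next }

    # Second pass: route each line by how often its first field occurred.
    {
        out = (count[$1] > 1) ? "file1.txt" : "file2.txt"
        print > out
    }
' mainfile.txt mainfile.txt
```

Because the counts are complete before any line is written, the first occurrence of a repeated ID is routed correctly, and the lines keep their original order within each output file.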

# 7  
Old 02-05-2011
@Franklin. That's right. I misread the post.
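For the array practice asked for in post #1, a single-pass variant is also possible. This is a hedged sketch, not something proposed in the thread: it buffers every line in memory (array names `line`, `key`, and `count` are illustrative), so it assumes the file is small enough to hold in RAM. The scratch directory and here-document are for illustration only.

```shell
# Scratch directory and sample data, for illustration only.
workdir=$(mktemp -d) && cd "$workdir"
cat > mainfile.txt <<'EOF'
613212, 36, 57
613212, 36, 10
613212, 36, 10
677774, 36, 57
619900, 10, 10
EOF

awk -F, '
    {
        line[NR] = $0      # every line, indexed by input order
        key[NR]  = $1      # the first field of that line
        count[$1]++        # how often that first field occurs
    }
    END {
        # Replay the buffered lines in order now that all counts are final.
        for (i = 1; i <= NR; i++)
            print line[i] > (count[key[i]] > 1 ? "file1.txt" : "file2.txt")
    }
' mainfile.txt
```

The trade-off against reading the file twice is memory for I/O: one read of the input, at the cost of holding all of it in arrays until the END block runs.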