Filtering Data


 
# 1  
Old 09-19-2007
Filtering Data

file1 contains (this is just a small sample of the data; the real file may have thousands of lines):

1 aaa 1/01/1975 delhi
2 bbb 2/03/1977 mumbai
3 ccc 1/01/1975 mumbai
4 ddd 2/03/1977 chennai
5 aaa 1/01/1975 kolkatta
6 bbb 2/03/1977 bangalore

program:

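nawk '{
  # key on column 2 (name) and column 3 (date)
  idx = $2 SUBSEP $3
  # append this line to the lines already stored for the key
  arr[idx] = (idx in arr) ? arr[idx] ORS $0 : $0
  arrCnt[idx]++
}
END {
  # print only keys seen more than once; for-in order is unspecified
  for (i in arr)
    if (arrCnt[i] > 1) print arr[i]
}' file1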

Result:

2 bbb 2/03/1977 mumbai
6 bbb 2/03/1977 bangalore
1 aaa 1/01/1975 delhi
5 aaa 1/01/1975 kolkatta

Question:

How should the code change if I need the result to look like this:


1 aaa 1/01/1975 delhi
3 ccc 1/01/1975 mumbai
2 bbb 2/03/1977 mumbai
4 ddd 2/03/1977 chennai

Please help! Thank you friends!
# 2  
Old 09-19-2007
More Info

It is good that you have supplied the input file format and the desired output format, but it would help if you could tell us what the actual requirement is.

What algorithm or method do you want to apply to the input to generate the output? It is not clear from your examples. I am guessing you want to get the entries from aaa until aaa is repeated. Is this the case?
# 3  
Old 09-19-2007
Data

The requirements are:

The program starts at line #1 and compares it with every other line in the file. If column #2 does not match AND column #3 does match, it prints all the lines that meet the condition. The program then moves on to line #2 and repeats the process until the end of the file.

I am not sure this is clear enough. Please let me know.
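
One possible reading of the requirement above, sketched in awk. This is an assumption and was not confirmed in the thread: for each date (column 3), keep the first line seen for each distinct name (column 2), and print a date's lines only when it has more than one distinct name. The function name filter_by_date and the array names are made up for illustration. Plain awk is used here; on systems where the original poster's nawk is the POSIX-style awk, substitute nawk.

```shell
# Hypothetical helper: wraps the awk program so it can be run on any file.
filter_by_date() {
    awk '
    {
        date = $3
        key  = $2 SUBSEP $3
        # skip a line whose name+date pair was already kept
        if (key in seen) next
        seen[key] = 1
        # remember dates in first-seen order so output order is stable
        if (!(date in names)) order[++n] = date
        names[date]++
        lines[date] = (names[date] > 1) ? lines[date] ORS $0 : $0
    }
    END {
        # print only dates shared by at least two different names
        for (i = 1; i <= n; i++)
            if (names[order[i]] > 1) print lines[order[i]]
    }' "$@"
}
```

Run as filter_by_date file1 on the sample data, this prints lines 1, 3, 2 and 4 in that order, which matches the requested result. A strictly literal line-by-line comparison would also print lines 5 and 6, so the "first line per name" rule is the part that most needs confirming.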
 