awk - filtering data by if --> into an array

# 1  
1. The problem statement, all variables and given/known data:
My data is in CSV format:
Code:
...
13/08/2012,16:30,303.30,5.10,3,2,2,1,9360.0,322
13/08/2012,16:40,305.50,5.00,3,2,2,1,12360.0,322
13/08/2012,16:50,319.90,3.80,3,2,1,0,2280.0
13/08/2012,17:00,326.10,3.50,3,2,1,1,4380.0,321
13/08/2012,17:10,333.00,3.80,3,3,1,0,2280.0
13/08/2012,17:20,330.20,4.50,3,3,2,0,2700.0
13/08/2012,17:30,332.00,4.20,3,3,2,1,5220.0,332
13/08/2012,17:40,334.90,3.20,3,3,1,0,1920.0
13/08/2012,17:50,337.30,3.10,3,3,1,1,3780.0,331
13/08/2012,18:00,343.20,2.20,3,3,1,1,5100.0,331
13/08/2012,18:10,345.20,1.30,3,3,1,1,5880.0,331
13/08/2012,18:20,340.70,1.60,3,3,1,1,6840.0,331
...

In the end I want to plot the three-digit index from the last column in a gnuplot graph, with a different color chosen depending on its first digit. For this I want to prepare the data with the awk script, so that I get:
Code:
...
08/13/12-16:30  \t      \t      322    \t
08/13/12-16:40  \t      \t      \t      422
08/13/12-16:50  \t      \t      \t      \t
08/13/12-17:00  \t      \t      321    \t
08/13/12-17:10  \t      \t      \t      \t  
08/13/12-17:20  \t      \t      \t      \t  
08/13/12-17:30  \t      \t      322    \t
08/13/12-17:40  \t      \t      \t      \t  
08/13/12-17:50  \t      \t      321    \t
08/13/12-18:00  131    \t       \t      \t
08/13/12-18:10  \t      \t      322    \t
08/13/12-18:20  \t      231    \t      \t
...

2. Relevant commands, code, scripts, algorithms:
Here is an awk script that changes the date and time format:
Code:
cd $home
        if test -f "$f" 
        then
        awk '{FS=",";  print  substr($1, 4, 2) "/" substr($1, 1, 2) "/" substr($1, 9, 2) "-" $2  "\t" $10}' $home/$f > index.txt
        else
        DAT=$(date +%Y%m%d)
        printf "$DAT - Datei $i nicht vorhanden" >> "$home/Logfiles/Log_$(date +%Y%m%d).log"
    fi

Code:
08/13/12-16:30  322
08/13/12-16:40  422
08/13/12-16:50  
08/13/12-17:00  321
08/13/12-17:10  
08/13/12-17:20  
08/13/12-17:30  332
08/13/12-17:40  
08/13/12-17:50  331
08/13/12-18:00  131
08/13/12-18:10  331
08/13/12-18:20  231

3. The attempts at a solution (include all code and scripts):
I tried to extend the script to separate the triplet in column 3:
Code:
#!/bin/bash
f=APS-Index_20120101_20120814.dat
home=$HOME/Desktop/Bachelor_Projekt/APS3321/Messdaten/APS_AUSWERTUNG_TRIPLET-Daten
cd $home
	if test -f "$f" 
	then
	awk -F"," '{ 
                      if($10 > 400)
                          print  substr($1, 4, 2) "/" substr($1, 1, 2) "/" substr($1, 9, 2) "-" $2  "\t\t\t" $10       $home/$f > index.txt;
	              else if ($10 < 400 && $10 > 300)
                          print  substr($1, 4, 2) "/" substr($1, 1, 2) "/" substr($1, 9, 2) "-" $2  "\t\t" $10 "\t"    $home/$f > index.txt;
                      else if ($10 < 300 && $10 > 200)  
                         print  substr($1, 4, 2) "/" substr($1, 1, 2) "/" substr($1, 9,  2) "-" $2  "\t" $10 "\t\t"    $home/$f > index.txt;
	              else if ($10 < 200 && $10 > 100)
                         print  substr($1, 4, 2) "/" substr($1, 1, 2) "/" substr($1, 9, 2) "-" $2  "\t" $10 "\t\t\t"   $home/$f > index.txt;
	else
	print  substr($1, 4, 2) "/" substr($1, 1, 2) "/" substr($1, 9, 2) "-" $2  ""   $home/$f > index.txt;
	}'
	else
	DAT=$(date +%Y%m%d)
        printf "$DAT - Datei $i nicht vorhanden" >> "$home/Logfiles/Log_$(date +%Y%m%d).log"
    fi

I tried to do it with the awk if statement above, but it does not work like this.
Can someone please help me do it the right way?
Thanks in advance!
IMPe

4. Complete Name of School (University), City (State), Country, Name of Professor, and Course Number (Link to Course):
University of Applied Sciences Aachen, Germany, Prof.Dr.Helsper, Bachelor Thesis

# 2  
Where do the 422, 231, 131 come from? They're not in your input data...

If I understand you correctly -- and I'm not sure I do, given that discrepancy -- you can use a trick to do what you want.

awk doesn't just let you read columns -- it lets you assign columns, too. The output separator is controlled by the OFS variable, and can be different from the input.

So you can do something like this:

Code:
awk -v OFS="\t" '{ # ...
        $0="date and timestamp" # Set entire line to this value, overwriting $1 $2 ... variables too
        N=5 # Fifth column
        $N="value" # Set Nth column to "value"
        print # print entire buffer
        }' file

to set the fifth, fourth, or whatever column to 'value' depending on which number N is set to. It will insert the right number of tabs in front as appropriate.
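
The padding behaviour described above can be tried on a single line. This is a minimal sketch; the input word `stamp`, the value `322` and the helper name `assign_demo` are made up for the illustration and do not come from the thread's data.

```shell
# assign_demo is a hypothetical helper name used only for this illustration.
assign_demo() {
    printf '%s\n' 'stamp' |
    awk -v OFS='\t' '{
        N = 4        # target column; any number works the same way
        $N = "322"   # fields 2 and 3 are created empty, joined with OFS
        print        # the record is rebuilt with tabs between fields
    }'
}

assign_demo    # prints: stamp<TAB><TAB><TAB>322
```

Because the record had only one field, assigning to field 4 makes awk create the empty fields 2 and 3 on its own, which is where the extra tabs come from.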

Last edited by Corona688; 08-21-2012 at 04:19 PM..
# 3  
Sorry, I forgot to mention this: the three-digit numbers in column $10 can be anything from 111 up to 433. I use these triplets to classify several data conditions.
# 4  
What relation does your output have to your input then? If your output doesn't match your input, we're only guessing.

My guess is, 122 would end up in the second column, 233 would be in the third, 344 would be in the fourth, and 455 would be in the fifth. But I can't really tell for sure.
# 5  
Quote:
Originally Posted by Corona688
What relation does your output have to your input then? If your output doesn't match your input, we're only guessing.

My guess is, 122 would end up in the second column, 233 would be in the third, 344 would be in the fourth, and 455 would be in the fifth. But I can't really tell for sure.
Hi Corona688!
First of all, thank you for helping me solve my problem.
The triplets should be sorted by their first digit, against the timestamp.
I would like to plot the triplets in a gnuplot graph, with the 4** triplets in red, the 3** in green, and so on. Therefore I need the triplet data in this structure:
Code:
...
08/13/12-16:30  \t      \t      322    \t
08/13/12-16:40  \t      \t      \t      422
08/13/12-16:50  \t      \t      \t      \t
08/13/12-17:00  \t      \t      321    \t
08/13/12-17:10  \t      \t      \t      \t  
08/13/12-17:20  \t      \t      \t      \t  
08/13/12-17:30  \t      \t      322    \t
08/13/12-17:40  \t      \t      \t      \t  
08/13/12-17:50  \t      \t      321    \t
08/13/12-18:00  131    \t       \t      \t
08/13/12-18:10  \t      \t      322    \t
08/13/12-18:20  \t      231    \t      \t
...

I think the main problem is the if structure. I'm not quite familiar with this part of awk. I think the script starts struggling when defining the field separator.

Thanks
IMPe
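
On the field separator: in POSIX awk, an assignment to FS inside the main action, as in `{FS=","; ...}`, only takes effect from the next record on, so the first line may still be split on whitespace. Setting it with `-F` (or in a `BEGIN` block) avoids that. A small sketch, where `split_csv` is a hypothetical helper name and the input is the first sample row from post #1:

```shell
# split_csv is a hypothetical helper name used only for this illustration.
split_csv() {
    # -F sets the field separator before the first record is read
    awk -F',' '{ print $2 "\t" $10 }'
}

printf '%s\n' '13/08/2012,16:30,303.30,5.10,3,2,2,1,9360.0,322' |
split_csv      # prints: 16:30<TAB>322
```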
# 6  
You haven't answered my question. What relation does your input have with your output? Does the column it ends up in, depend on the first digit of the triplet?

That is what I am assuming since you refuse to say, but in all probability I have guessed wrong...

You are making it way more complicated than you need to with if/else if/else if/else if. You are also making it way more complicated than you need to with all those substr things. Just use columns to get the data you want and assign it where you wish. N=5; $N="abcd" sets column 5 to abcd for instance without needing 5 if/else if/else if/else if's to count up to 5. The number could be 3, 7, or 99 and it would work the same way.

Code:
awk -F"," -v OFS="\t" '{
        L=$10
        N=substr(L, 1, 1)+1;
        $0=$1"-"$2;
        if(L) $N=L } 1' input > output
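
To see what the script above does, it can be run on two of the sample rows from post #1, one with a triplet and one without. This is only a sketch for checking the behaviour; the wrapper name `place_triplet` is hypothetical and the awk body is the one from the reply, with comments added.

```shell
# place_triplet is a hypothetical wrapper name; the awk body follows the reply above.
place_triplet() {
    awk -F',' -v OFS='\t' '{
        L = $10                    # triplet; empty when the row has only 9 fields
        N = substr(L, 1, 1) + 1    # first digit 1..4 selects column 2..5
        $0 = $1 "-" $2             # the record becomes just date-time
        if (L) $N = L              # place the triplet; awk pads with tabs
    } 1'
}

printf '%s\n' \
    '13/08/2012,16:30,303.30,5.10,3,2,2,1,9360.0,322' \
    '13/08/2012,16:50,319.90,3.80,3,2,1,0,2280.0' |
place_triplet
# prints:
# 13/08/2012-16:30<TAB><TAB><TAB>322
# 13/08/2012-16:50
```

Note that the input file and the output redirection sit outside the quoted awk program. Inside the single quotes, something like `$home/$f` is interpreted by awk rather than by the shell. The date also stays in its original dd/mm/yyyy form here.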

# 7  
Hi

Unfortunately, I found out after all that gnuplot has problems displaying data in the created structure.
Code:
...
08/13/12-16:30  \t      \t      322    \t
08/13/12-16:40  \t      \t      \t      422
08/13/12-16:50  \t      \t      \t      \t
08/13/12-17:00  \t      \t      321    \t
...

There has to be a "-" in place of every missing number.
Code:
...
08/13/12-16:30  -      -      322   -
08/13/12-16:40  -      -      -      422
08/13/12-16:50  -      -      -      -
08/13/12-17:00  -      -      321   -
...

The OFS is still "\t". I tried to do it based on the script above, but I got stuck. Can you please give me one more hint?

Thanks in advance!
IMPe

Last edited by IMPe; 08-21-2012 at 08:18 PM..
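
One possible way to get the "-" placeholders, offered as a sketch and not as a reply from the thread: fill columns 2 to 5 with "-" first, then overwrite the column selected by the triplet's first digit. It assumes the column rule from the script in post #6 (first digit plus one) and triplets whose first digit is 1 to 4, as stated in post #3. The wrapper name `pad_triplet` is hypothetical.

```shell
# pad_triplet is a hypothetical wrapper name used only for this sketch.
pad_triplet() {
    awk -F',' -v OFS='\t' '{
        L = $10                             # triplet; empty if the row has no 10th field
        N = substr(L, 1, 1) + 1             # first digit 1..4 selects column 2..5
        $0 = $1 "-" $2                      # the record becomes just date-time
        for (i = 2; i <= 5; i++) $i = "-"   # placeholder in every data column
        if (L) $N = L                       # overwrite one placeholder with the triplet
    } 1'
}

printf '%s\n' \
    '13/08/2012,16:30,303.30,5.10,3,2,2,1,9360.0,322' \
    '13/08/2012,16:50,319.90,3.80,3,2,1,0,2280.0' |
pad_triplet
# prints:
# 13/08/2012-16:30<TAB>-<TAB>-<TAB>322<TAB>-
# 13/08/2012-16:50<TAB>-<TAB>-<TAB>-<TAB>-
```

Every output row then has the same five tab-separated columns. On the gnuplot side, `set datafile missing "-"` may be worth a look so that the placeholder is treated as missing data; whether it is needed depends on the plot command used.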