Shell Programming and Scripting: Post 302927791 by jianp83 on Friday 5th of December 2014 01:22:14 PM
Data filtering and category assigning

Please consider the following file. It contains many groups, and each row has one of three types: T1 (Serial_number 1), T2 (Serial_number 2), or T1*T2 (any other Serial_number).

I only want to consider groups in which both T1 and T2 are present and their values differ from each other. In the example file, Group3 and Group5 are excluded: Group3 because its T1 and T2 values are identical (both 'gg'), and Group5 because it has no T2 row.
Note that the data is not sorted, so the T1, T2 and T1*T2 rows are scattered through the file in no particular order.
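
The group filter on its own can be sketched as follows. This is a minimal illustration, not a full solution: the rows are a hypothetical, deliberately unsorted subset of the example file, and the variable name keep is made up for the sketch. Because the T1 and T2 values are collected into arrays first and compared only in the END block, row order does not matter.

```shell
# Hypothetical sample: a few rows from the example, deliberately unsorted.
# printf reuses its format for every four arguments, giving one row each.
keep=$(printf '%s\t%s\t%s\t%s\n' \
    Group Type Value Serial_number \
    Group1 'T1*T2' at 3 \
    Group3 T2 gg 2 \
    Group1 T2 tt 2 \
    Group5 T1 gg 1 \
    Group3 T1 gg 1 \
    Group1 T1 aa 1 \
    Group5 'T1*T2' gt 5 |
awk '
    # Remember the T1 and T2 value seen for each group.
    $2 == "T1" { t1[$1] = $3 }
    $2 == "T2" { t2[$1] = $3 }
    END {
        # Keep a group only if both values exist and they differ.
        for (g in t1)
            if ((g in t2) && t1[g] != t2[g])
                print g
    }' | sort)

echo "$keep"
```

With this subset only Group1 is listed: Group3 is dropped because both values are 'gg', and Group5 because it has no T2 row. The final sort is there because awk does not guarantee any order for "for (g in t1)".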



Code:
Group	Type	Value	Serial_number
Group1	T1	aa	1
Group1	T2	tt	2
Group1	T1*T2	at	3
Group1	T1*T2	tt	4
Group2	T1	gg	1
Group2	T2	tt	2
Group2	T1*T2	gg	3
Group2	T1*T2	tt	4
Group2	T1*T2	gt	5
Group3	T1	gg	1
Group3	T2	gg	2
Group3	T1*T2	gg	3
Group3	T1*T2	gg	5
Group4	T1	gg	1
Group4	T2	tt	2
Group4	T1*T2	gt	4
Group4	T1*T2	gg	5
Group5	T1	gg	1
Group5	T1*T2	gt	5


I want to add a column to the output, only for rows of type T1*T2, that states whether the value matches the group's T1 value, matches the group's T2 value, or matches neither.


For example, in Group1 the T1*T2 value with Serial_number 3 is 'at', which matches neither the T1 value 'aa' nor the T2 value 'tt', so it is labelled 'different'.

In Group1, the T1*T2 value with Serial_number 4 is 'tt', which matches the T2 value 'tt', so it is labelled 'T2-like'.

The expected output for the example file:

Code:
Group	Type	Value	Serial_number	Similar_To
Group1	T1*T2	at	3	different
Group1	T1*T2	tt	4	T2-like
Group2	T1*T2	gg	3	T1-like
Group2	T1*T2	tt	4	T2-like
Group2	T1*T2	gt	5	different
Group4	T1*T2	gt	4	different
Group4	T1*T2	gg	5	T1-like
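
The rule for the Similar_To column can be sketched as a small awk function. The name classify and the variable labels are hypothetical, chosen only for this illustration; the three labels are the ones used in the expected output above. Checking T1 before T2 is safe here because only groups whose T1 and T2 values differ are kept, so a value can never match both.

```shell
labels=$(awk '
    # classify() is a hypothetical helper name, not from the original post.
    # v is the T1*T2 value; t1 and t2 are the T1 and T2 values of its group.
    function classify(v, t1, t2) {
        if (v == t1) return "T1-like"
        if (v == t2) return "T2-like"
        return "different"
    }
    BEGIN {
        # Group1 from the example: T1 = aa, T2 = tt.
        print classify("at", "aa", "tt")
        print classify("tt", "aa", "tt")
    }')

echo "$labels"
```

For the two Group1 rows this prints 'different' and then 'T2-like', matching the first two lines of the expected output.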

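This is my attempt, corrected so that it runs. It reads infile twice: the first pass records the T1 and T2 value of each group, and the second pass labels the T1*T2 rows, so the input does not need to be sorted.

Code:
awk '
# Pass 1 (first read of infile): remember the T1 and T2 value of each group.
NR == FNR {
    if ($2 == "T1") t1[$1] = $3
    else if ($2 == "T2") t2[$1] = $3
    next
}
# Pass 2: the header line gets the new column name.
FNR == 1 { print $0 "\t" "Similar_To"; next }
# Keep T1*T2 rows of groups that have both T1 and T2 with different values.
$2 == "T1*T2" && ($1 in t1) && ($1 in t2) && t1[$1] != t2[$1] {
    if ($3 == t1[$1]) categ = "T1-like"
    else if ($3 == t2[$1]) categ = "T2-like"
    else categ = "different"
    print $0 "\t" categ
}' infile infile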


Last edited by jianp83; 12-05-2014 at 02:31 PM..
 
