UNIX for Dummies Questions & Answers: Data analysis, Regular Expression - Unix. Post 302668947 by @man on Tuesday 10th of July 2012 09:02:28 AM
07-10-2012
if statement? Bash scripting :|

Hey everyone!
I have a tab-separated dataset like this:
Code:
1	100	1	0
5	100	1	8
7	50	1	0
7	100	2	0
10	20	1	8
10	30	1	8
10	100	3	8
15	50	5	0
20	90	1	0
20	99	9	0

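I want to check whether the 4th column is 0 or 8.
If it's 0, write the 1st column as it is; if it's 8, write the sum of the 1st and 2nd columns instead. The 2nd column is then dropped, and the old 3rd and 4th columns are kept.
Something like this: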

Code:
1	1	0
105	1	8
7	1	0
7	2	0
30	1	8
40	1	8
110	3	8
15	5	0
20	1	0
20	9	0
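
The step above could be sketched with awk roughly as follows. This is only a sketch under assumptions: the file names `input.txt` and `step1.txt` and the scratch directory from `mktemp -d` are made up for illustration, the data is taken to be tab-separated, and rows whose 4th column is neither 0 nor 8 are silently skipped because the post does not say what to do with them.

```shell
# Scratch directory and file names are hypothetical, not from the post.
tmp=$(mktemp -d)

# Recreate the sample input; printf reuses the format for every 4 arguments.
printf '%s\t%s\t%s\t%s\n' \
  1 100 1 0 \
  5 100 1 8 \
  7 50 1 0 \
  7 100 2 0 \
  10 20 1 8 \
  10 30 1 8 \
  10 100 3 8 \
  15 50 5 0 \
  20 90 1 0 \
  20 99 9 0 > "$tmp/input.txt"

# Column 4 is 0: keep column 1. Column 4 is 8: replace it with col1 + col2.
# In both cases print the new key, the old column 3 and the old column 4.
awk 'BEGIN { FS = OFS = "\t" }
     $4 == 0 { print $1, $3, $4 }
     $4 == 8 { print $1 + $2, $3, $4 }' "$tmp/input.txt" > "$tmp/step1.txt"

cat "$tmp/step1.txt"
```

Two pattern-action pairs play the role of the "if statement" from the title, so no explicit `if` is needed inside awk.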

Then, after sorting on the first column, write each distinct first-column value once, and put in the 2nd column the sum of the 2nd-column numbers over all rows that share that first-column value. Like this:

Code:
1	1	0
7	3	0
15	5	0
20	10	0
30	1	8
40	1	8
105	1	8
110	3	8
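
A possible way to get this summed and sorted result is an awk associative array followed by a numeric sort. Again this is a sketch with assumptions: the file names `step1.txt` and `step2.txt` are hypothetical, and rows are grouped on the pair (1st column, 3rd column), since the post does not say what should happen if the same first-column value appears with both 0 and 8.

```shell
# Scratch directory and file names are hypothetical, not from the post.
tmp=$(mktemp -d)

# Intermediate data from the post: key, count, flag (tab-separated).
printf '%s\t%s\t%s\n' \
  1 1 0 \
  105 1 8 \
  7 1 0 \
  7 2 0 \
  30 1 8 \
  40 1 8 \
  110 3 8 \
  15 5 0 \
  20 1 0 \
  20 9 0 > "$tmp/step1.txt"

# Sum column 2 per (column 1, column 3) pair, then sort numerically
# on column 1 (and on column 3 to break ties).
awk 'BEGIN { FS = OFS = "\t" }
     { sum[$1 FS $3] += $2 }
     END {
       for (k in sum) {
         split(k, part, FS)          # part[1] = key, part[2] = flag
         print part[1], sum[k], part[2]
       }
     }' "$tmp/step1.txt" |
  sort -t "$(printf '\t')" -k1,1n -k3,3n > "$tmp/step2.txt"

cat "$tmp/step2.txt"
```

The sort comes after the aggregation because `for (k in sum)` visits the keys in no particular order, so sorting the input first would not guarantee sorted output.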

At the end I want to split the file into two files according to the value in the 3rd column.

If you can help me with this, you'll save my life!
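
For the final split, one option is to let awk pick the output file from the 3rd column. This is a sketch, not a definitive answer: the names `step2.txt`, `flag_0.txt` and `flag_8.txt` are invented for illustration, and the input is the summed table shown in the post.

```shell
# Scratch directory and file names are hypothetical, not from the post.
tmp=$(mktemp -d)

# Summed and sorted data from the post: key, total, flag (tab-separated).
printf '%s\t%s\t%s\n' \
  1 1 0 \
  7 3 0 \
  15 5 0 \
  20 10 0 \
  30 1 8 \
  40 1 8 \
  105 1 8 \
  110 3 8 > "$tmp/step2.txt"

# Write each row to a file named after its 3rd column:
# flag_0.txt for rows ending in 0, flag_8.txt for rows ending in 8.
awk -F '\t' -v dir="$tmp" '{
  out = dir "/flag_" $3 ".txt"   # build the file name first, then redirect
  print > out
}' "$tmp/step2.txt"

ls "$tmp"
```

Building the name in the variable `out` before the redirection avoids the parsing ambiguity some awk versions have with a concatenation directly after `>`.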

Last edited by @man; 07-10-2012 at 10:51 AM..
 
