Extracting a portion of a data file with identifier
Post 302378706 by Lucky Ali on Tuesday 8th of December 2009 03:32:44 PM
Thanks,
How do I run it? Do I save it as a .sh file and run it like a shell script?
Also, for 'read pattern', do I just have to write read GTKDH?

Please let me know.

LS
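
A minimal sketch of the two questions above, assuming the code in question is a POSIX shell script. The script body here is only a stand-in (the real one is not shown in this post), and the temporary file takes the place of whatever .sh file name is chosen. The point it illustrates: `read pattern` stays in the script unchanged, and GTKDH is supplied as input when the script runs. Writing `read GTKDH` would instead create a variable named GTKDH.

```shell
# Stand-in script file; in practice this would be a file saved
# under a name of your choice, for example with a .sh extension.
script=$(mktemp)

cat > "$script" <<'EOF'
#!/bin/sh
# `read pattern` waits for one line on standard input and stores it
# in the variable named `pattern`.
read pattern
echo "pattern is: $pattern"
EOF

# Run the saved file through the shell. Interactively you would type
# GTKDH at the prompt; here it is piped in so the example does not block.
out=$(printf 'GTKDH\n' | sh "$script")
echo "$out"

rm -f "$script"
```

Alternatively, the file can be made executable with `chmod +x` and started by its path.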

---------- Post updated at 03:32 PM ---------- Previous update was at 02:29 PM ----------

I tried the second code (shell) and it worked very well.
But there is a problem with the numbering of the identifier in the output file.

For identifiers less than 10, the program output the correct number, but when the identifier is 10 or greater, the identifier in the output is always 9.

This is an example of the real output data.

1 3 9 36 281 2.0e+004 ATTGCATGC
2 4 12 50 403 1.3e+005 GCATGCAAATTT
7 8 15 9 90 7.2e+008 TGCATGCAAAAATGC
9 8 7 14 103 3.4e+008 GCATGCA
9 2 7 35 293 1.4e-004 GCATGCA
9 3 11 27 225 1.5e+006 GCATGCAAAAT
9 3 9 31 273 1.8e-004 TTGCATGCA
9 7 7 9 75 4.4e+005 TGCATGC
9 1 9 21 186 4.3e-002 TGCATGCAA
9 1 19 12 165 3.9e-005 TGGCGGGAAATGCATGCAG
9 1 20 49 538 1.4e-036 TTTAAAATTGCATGCATGCA
9 6 7 17 132 1.7e+007 GCATGCA
9 4 11 14 128 2.2e+006 TGCATGCACAC
9 4 7 20 145 6.0e+008 TGCATGC
9 3 9 15 149 5.7e-001 TGCATGCAA
9 1 9 25 231 7.3e-007 GCATGCAAA
9 1 16 34 357 5.9e-014 AAATTTGCATGCAAAC
9 5 11 8 88 2.5e+004 AAATGCATGCA
9 7 7 10 86 1.6e+005 TGCATGC
9 4 9 18 150 6.7e+006 TTTGCATGC
9 1 16 45 480 4.6e-034 GCATGCATTTGGCGCC
9 3 9 45 360 3.0e+002 CTTGCATGC
9 3 9 16 150 6.3e+000 GCATGCAAA
9 5 9 8 80 2.7e+004 TGCATGCAA
9 4 9 16 157 1.5e-001 GCATGCAAA
9 1 14 32 347 1.3e-022 GTTGCATGCATGCA
9 2 9 9 89 2.0e+004 TGCATGCAC
9 3 7 14 116 1.7e+005 CGCATGC
9 1 12 21 223 1.1e-012 TTTTGCATGCAA
9 3 9 16 150 1.0e+001 TGCATGCAA
9 6 9 17 142 3.2e+007 GCATGCACA
9 4 9 6 62 2.1e+005 GCATGCAAA
9 2 9 14 144 2.0e-002 TGCATGCAA
9 3 8 15 121 1.0e+005 GCATGCAA
9 6 9 14 117 2.6e+006 TGCATGCAT
9 2 16 13 163 2.6e-005 ATTTGCATGCATTCAA
9 3 12 42 378 1.8e-007 ATATGCATGCAA
9 2 9 54 468 5.3e-012 TTGCATGCA


Please let me know how to correct this problem.

LA
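
The symptom described above can be checked by tallying the first column of the output file. This is only a diagnostic sketch: it uses the first seven rows quoted in this post as sample input, and it does not identify the cause, which would require seeing the script itself.

```shell
# First seven rows of the output quoted in the post, used as sample input.
sample='1 3 9 36 281 2.0e+004 ATTGCATGC
2 4 12 50 403 1.3e+005 GCATGCAAATTT
7 8 15 9 90 7.2e+008 TGCATGCAAAAATGC
9 8 7 14 103 3.4e+008 GCATGCA
9 2 7 35 293 1.4e-004 GCATGCA
9 3 11 27 225 1.5e+006 GCATGCAAAAT
9 3 9 31 273 1.8e-004 TTGCATGCA'

# Count how many rows carry each identifier (column 1),
# then sort the tally numerically by identifier.
tally=$(printf '%s\n' "$sample" |
  awk '{ n[$1]++ } END { for (id in n) print id, n[id] }' |
  sort -n)
echo "$tally"
```

If no identifier above 9 appears in the tally of the full file and the count for 9 is unexpectedly large, the rows that should carry 10 and higher have most likely been labelled 9.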
 
