Post 302378659 by Lucky Ali on Tuesday 8th of December 2009 11:56:59 AM
Extracting a portion of a data file with identifier

Hi,
I have a tab-delimited text file in the following format.

1 (the identifier of each group; this note is not present in the file, only the number is)
1 3 4 65 56 WERTF
2 3 4 56 56 GHTYHU
3 3 5 64 23 VMFKLG
2
1 3 4 65 56 DGTEYDH
2 3 4 56 56 FJJJCKC
3 3 5 64 23 FNNNCHD
3
1 3 4 65 56 JDHJDH
2 3 4 56 56 FDFDJ
3 3 5 64 23 FHDKF

.
.
.
.
50

1 3 4 56 56 GHTYHU
1 33 4 64 76 WERTF
3 3 5 64 23 VMFKLG

I want to search the entire file for a user-supplied text, e.g. WERTF, and then output every line that has that text in the 6th column, together with the identifier of its group.

For example, if I search for 'WERTF', I would like this output:
1 1 3 4 65 56 WERTF
50 1 33 4 64 76 WERTF

where the first column (1 and 50 here) is the group identifier.

Is there a good way to do this, either with regular expressions in a script or with awk?

Please let me know.
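
One possible approach, as a minimal sketch rather than a definitive answer: let awk remember the most recent identifier line and print it in front of every matching row. It assumes that a line with exactly one tab-separated field is a group identifier and that data rows always have the search text in column 6. The function name `find_by_label` and the file name `data.txt` are made up for illustration.

```shell
# Sketch: print rows whose 6th column equals the search text,
# prefixed with the identifier of the group they belong to.
# Assumption: any line with exactly one field is a group identifier.
find_by_label() {
    # $1 = search text (e.g. WERTF), $2 = tab-delimited data file
    awk -F'\t' -v OFS='\t' -v key="$1" '
        NF == 1   { id = $1; next }   # remember the current group identifier
        $6 == key { print id, $0 }    # exact match on column 6
    ' "$2"
}
```

Usage would be `find_by_label WERTF data.txt`. The comparison is an exact match on column 6; for a pattern match, `$6 ~ key` could be used instead. If identifier lines carry trailing tabs or spaces, the `NF == 1` test would need adjusting.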
 
