Post 302378659 by Lucky Ali on Tuesday 8th of December 2009 11:56:59 AM
Extracting a portion of a data file with identifier

Hi,
I have a TAB-delimited text file in the following format:

1 (<- identifier of each group; this note is not in the file, only the number is)
1 3 4 65 56 WERTF
2 3 4 56 56 GHTYHU
3 3 5 64 23 VMFKLG
2
1 3 4 65 56 DGTEYDH
2 3 4 56 56 FJJJCKC
3 3 5 64 23 FNNNCHD
3
1 3 4 65 56 JDHJDH
2 3 4 56 56 FDFDJ
3 3 5 64 23 FHDKF

.
.
.
.
50

1 3 4 56 56 GHTYHU
1 33 4 64 76 WERTF
3 3 5 64 23 VMFKLG

I want to search the entire file for a given text, e.g. WERTF (user input), and output every line that has that text in the 6th column, prefixed with the identifier of its group.

For example, if I search for 'WERTF', I would like the output to be:
1 1 3 4 65 56 WERTF
50 1 33 4 64 76 WERTF

where the first field of each output line (1 and 50) is the group identifier.

Is there a good way to do this, either with regular expressions in a script or with awk?

Please let me know.
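
One possible approach, as a minimal awk sketch rather than a definitive answer: remember the most recent identifier line and print it in front of every matching data line. The function name `extract_by_label` is made up for illustration. The sketch assumes that an identifier line holds nothing but digits (no trailing TAB) and that the 6th column must equal the search text exactly.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not from the original post).
# $1 = search text (user input), $2 = TAB-delimited data file.
# Prints "identifier<TAB>original line" for each line whose 6th field
# is exactly the search text.
extract_by_label() {
    awk -F'\t' -v OFS='\t' -v want="$1" '
        # Assumption: a line with a single, all-digit field is a group identifier.
        NF == 1 && $1 ~ /^[0-9]+$/ { group = $1; next }
        # Data line: compare the 6th field with the search text (exact match).
        NF >= 6 && $6 == want { print group, $0 }
    ' "$2"
}
```

With the function loaded in the shell, a call such as `extract_by_label WERTF data.txt` (where `data.txt` stands in for the real file name) should print the matching lines with their identifiers. Blank lines are skipped because they match neither rule. If a substring match is wanted instead of an exact one, `$6 == want` could be replaced with `index($6, want)`.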
 
