12-08-2009
Extracting a portion of a data file with identifier
Hi,
I have a tab-delimited text file with the following format.
1 (<- identifier of each group; this note is not in the file, only the number)
1 3 4 65 56 WERTF
2 3 4 56 56 GHTYHU
3 3 5 64 23 VMFKLG
2
1 3 4 65 56 DGTEYDH
2 3 4 56 56 FJJJCKC
3 3 5 64 23 FNNNCHD
3
1 3 4 65 56 JDHJDH
2 3 4 56 56 FDFDJ
3 3 5 64 23 FHDKF
.
.
.
.
50
1 3 4 56 56 GHTYHU
1 33 4 64 76 WERTF
3 3 5 64 23 VMFKLG
I want to search the entire file for a piece of text supplied by the user, e.g. WERTF, and output every line that has that text in the 6th column, prefixed with the identifier of its group.
For example, if I search for 'WERTF', I would like this output:
1 1 3 4 65 56 WERTF
50 1 33 4 64 76 WERTF
where the first column of each output line is the group identifier.
What is the best way to do this, either with regular expressions in a script or with awk?
Please let me know.
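One possible approach, as a minimal awk sketch: remember the most recent line that consists of a single number (the group identifier), and print it in front of every data line whose 6th tab-separated field equals the search text. The function name `find_by_label` is made up for illustration, and the sketch assumes the file is strictly tab-delimited with no trailing whitespace or carriage returns.

```shell
# Hypothetical helper: find_by_label FILE TEXT
# Prints "identifier<TAB>line" for each data line whose 6th field is TEXT.
find_by_label() {
    awk -F'\t' -v want="$2" '
        # A line holding one numeric field starts a new group: remember it.
        NF == 1 && $1 ~ /^[0-9]+$/ { id = $1; next }
        # A data line matches when its 6th field is exactly the search text.
        NF >= 6 && $6 == want      { print id "\t" $0 }
    ' "$1"
}
```

Called as `find_by_label data.txt WERTF` (with `data.txt` standing in for the real file name), this should print the two lines shown in the question. The comparison is an exact string match rather than a regular expression; note that `awk -v` interprets backslash escapes in the value, so search text containing backslashes would need extra care.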
GENCAT(1) BSD General Commands Manual GENCAT(1)
NAME
gencat -- NLS catalog compiler
SYNOPSIS
gencat output-file input-files...
DESCRIPTION
The gencat utility merges the text NLS input files input-files... into a formatted message catalog file output-file. The file output-file
will be created if it does not already exist. If output-file does exist, its messages will be included in the new output-file. If set and
message numbers collide, the new message text defined in input-files... will replace the old message text currently contained in output-file.
INPUT FILES
The format of a message text source file is defined below. Note that the fields of a message text source line are separated by a single
space character: any other space characters are considered to be part of the field contents.
$set n comment
This line specifies the set identifier of the following messages until the next $set or end-of-file appears. The argument n is the set
identifier which is defined as a number in the range [1, (NL_SETMAX)]. Set identifiers must occur in ascending order within a single
source file, but need not be contiguous. Any string following a space following the set identifier is treated as a comment. If no $set
directive is specified in a given source file, all messages will be located in the default message set NL_SETD.
$del n comment
This line deletes messages from set n from a message catalog. The n specifies a set number. Any string following a space following the
set number is treated as a comment.
$ comment
A line beginning with $ followed by a space is treated as a comment.
m message-text
A message line consists of a message identifier m in the range [1, (NL_MSGMAX)]. The message-text is stored in the message catalog with
the set identifier specified by the last $set directive, and the message identifier m. If the message-text is empty, and there is a
space character following the message identifier, an empty string is stored in the message catalog. If the message-text is empty, and
if there is no space character following the message identifier, then the existing message in the current set with the specified message
identifier is deleted from the catalog. Message identifiers must be in ascending order within a single set, but need not be contiguous.
The message-text length must be in the range [0, (NL_TEXTMAX)].
$quote c
This line specifies an optional quote character c which can be used to surround message-text so that trailing space or empty messages
are visible in message source files. By default, or if an empty $quote directive is specified, no quoting of message-text will be
recognized.
Empty lines in message source files are ignored. The effect of lines beginning with any character other than those described above is
undefined.
Text strings can contain the following special characters and escape sequences. In addition, if a quote character is defined, it may be
escaped as well to embed a literal quote character.
\n line feed
\t horizontal tab
\v vertical tab
\b backspace
\r carriage return
\f form feed
\\ backslash
\ooo octal number in the range [000, 377]
A backslash character immediately before the end of the line in a file is used to continue the line onto the next line, e.g.:
1 This line is continued \
on this line.
If the character following the backslash is not one of those specified, the backslash is ignored.
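As a rough illustration of the source format described above, the following sketch writes a small message source file and compiles it with gencat. The file names, set comment, and message texts are invented for the example, and it assumes a gencat that accepts the `gencat output-file input-files...` form shown in the SYNOPSIS.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# Message source: one comment line, one set, two messages in ascending order.
# The quoted 'EOF' keeps the shell from expanding the leading "$" characters.
cat > "$workdir/hello.msg" <<'EOF'
$ Example catalog source (this line is a comment)
$set 1 Greetings
1 Hello, world
2 Goodbye
EOF

# Compile the source; hello.cat is created because it does not exist yet.
gencat "$workdir/hello.cat" "$workdir/hello.msg"
```

A program would then open the resulting catalog with catopen(3) and look up set 1, message 1 or 2 with catgets(3).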
DIAGNOSTICS
The gencat utility exits 0 on success, and >0 if an error occurs.
SEE ALSO
catclose(3), catgets(3), catopen(3)
STANDARDS
The gencat utility is compliant with the X/Open Portability Guide Issue 4 (``XPG4'') standard.
AUTHORS
This manual page was originally written by Ken Stailey and later revised by Terry Lambert.
BUGS
A message catalog file created from a blank input file cannot be revised; it must be deleted and recreated.
BSD
June 11, 1997 BSD