Issue when using egrep to extract strings (too many strings)
Post 302971486 by Scrutinizer on Wednesday 20th of April 2016 06:34:24 PM
You could try putting those strings in a file, like so:

Code:
ILMN_2258774
ILMN_1700477
...
ILMN_1805992

Then you can extract like so:
Code:
grep -f stringfile test1>test2
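
A minimal sketch of the steps above, run end to end. The file names stringfile, test1 and test2 come from the post; the contents of test1 are made up for illustration, since the real input is not shown. The -F option is an addition that is not in the post: it tells grep to treat each line of stringfile as a fixed string rather than a regular expression, which seems to fit plain IDs like these.

```shell
# stringfile: one search string per line (IDs taken from the post).
printf '%s\n' ILMN_2258774 ILMN_1700477 ILMN_1805992 > stringfile

# test1: hypothetical input; the real file layout is not shown in the post.
printf '%s\n' \
  'ILMN_2258774 geneA 1.5' \
  'ILMN_9999999 geneB 0.2' \
  'ILMN_1805992 geneC 3.1' > test1

# -f reads the patterns from stringfile.
# -F (an addition, not in the post) treats them as fixed strings.
grep -F -f stringfile test1 > test2

# Show what was extracted: the geneA and geneC lines in this sample.
cat test2
```

Keeping the strings in a file avoids building one very long egrep alternation on the command line.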

For accuracy, it would be better to anchor the patterns by putting a single space after each of the strings (ILMN_ is unique enough that there does not need to be a ^ in front). This avoids possible false positives caused by substring matches, unless all strings have the same length:

Code:
ILMN_2258774 
ILMN_1700477 
...
ILMN_1805992
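
To illustrate why the trailing space matters, here is a small sketch with made-up data in which one ID is a prefix of another. All file names (sample_input, ids_plain, ids_anchored and the two output files) are hypothetical, and the sketch assumes the ID is followed by a single space in the data, as the post implies. The sed command is just one way to append the space to every line of an existing list.

```shell
# Hypothetical data: the second ID is the first one plus an extra digit.
printf '%s\n' \
  'ILMN_1700477 geneA' \
  'ILMN_17004771 geneB' > sample_input

# Plain list of IDs, one per line, with no trailing space.
printf '%s\n' ILMN_1700477 > ids_plain

# Unanchored: the short ID also matches as a prefix of the longer one.
grep -F -f ids_plain sample_input > unanchored_out

# Append a single space to every ID, as the post suggests.
sed 's/$/ /' ids_plain > ids_anchored

# Anchored: only the line where the ID is followed by a space matches.
grep -F -f ids_anchored sample_input > anchored_out

wc -l < unanchored_out   # both lines matched
wc -l < anchored_out     # only the geneA line matched
```

If the columns in the real file are separated by tabs rather than spaces, the appended character would need to be a tab instead.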

--
On Solaris use /usr/xpg4/bin/grep
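
For a script that has to run on both Solaris and other systems, one possible sketch is to pick the grep binary at run time. The variable name GREP and the sample file names are hypothetical; the only detail taken from the post is the path /usr/xpg4/bin/grep.

```shell
# Prefer the XPG4 grep when it exists (as on Solaris); otherwise fall back
# to whatever grep is first on PATH.
if [ -x /usr/xpg4/bin/grep ]; then
  GREP=/usr/xpg4/bin/grep
else
  GREP=grep
fi

# Hypothetical sample files, just to exercise the chosen grep.
printf '%s\n' ILMN_2258774 > solaris_ids
printf '%s\n' 'ILMN_2258774 geneA' 'ILMN_0000001 geneB' > solaris_input

"$GREP" -f solaris_ids solaris_input > solaris_out
cat solaris_out
```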

Last edited by Scrutinizer; 04-20-2016 at 08:23 PM..
strextract(1)                General Commands Manual                strextract(1)

NAME
       strextract - batch string extraction

SYNOPSIS
       strextract [-p patternfile] [-i ignorefile] [-d] [source-program...]

OPTIONS
       -i ignorefile
              Ignore text strings specified in ignorefile. By default, the strextract command searches for ignorefile in the current working directory, your home directory, and /usr/lib/nls. If you omit the -i option, strextract recognizes all strings specified in the patterns file.

       -p patternfile
              Use patternfile to match strings in the input source program. By default, the command searches for the pattern file in the current working directory, your home directory, and finally /usr/lib/nls. If you omit the -p option, the strextract command uses a default patterns file that is stored in /usr/lib/nls/patterns.

       -d     Disables warnings of duplicate strings. If you omit the -d option, strextract prints warnings of duplicate strings in your source program.

DESCRIPTION
       The strextract command extracts text strings from source programs. This command also writes the strings it extracts to a message text file. The message text file contains the text for each message extracted from your input source program. The strextract command names the file by appending to the name of the input source program.

       In the source-program argument, you name one or more source programs from which you want messages extracted. The strextract command does not extract messages from source programs included using the #include directive. Therefore, you might want to name a source program and all the source programs it includes on a single strextract command line.

       You can create a patterns file (as specified by patternfile) to control how the strextract command extracts text. The patterns file is divided into several sections, each of which is identified by a keyword. The keyword must start at the beginning of a new line, and its first character must be a dollar sign ($). Following the identifier, you specify a number of patterns. Each pattern begins on a new line and follows the regular expression syntax you use in the regexp(3) routine. For more information on the patterns file, see the patterns(4) reference page.

       In addition to the patterns file, you can create a file that indicates strings that extract ignores. Each line in this ignore file contains a single string to be ignored that follows the syntax of the regexp(3) routine.

       When you invoke the strextract command, it reads the patterns file and the file that contains strings it ignores. You can specify a patterns file and an ignore file on the strextract command line. Otherwise, the strextract command matches all strings and uses the default patterns file.

       If strextract finds strings which match the ERROR directive in the pattern file, it reports the strings to standard error (stderr) but does not write the strings to the message file.

       After running strextract, you can edit the message text file to remove text strings which do not need translating before running strmerge.

       It is recommended that you use the extract command as a visual front end to the strextract command rather than running strextract directly.

RESTRICTIONS
       Given the default pattern file, you cannot cause strextract to ignore strings in comments that are longer than one line.

       You can specify only one rewrite string for all classes of pattern matches.

       The strextract command does not extract strings from files included with the #include directive. You must run the strextract command on these files separately.

EXAMPLES
              % strextract -p c_patterns prog.c prog2.c
              % vi prog.str
              % strmerge -p c_patterns prog.c prog2.c
              % gencat prog.cat prog.msg prog2.msg
              % vi nl_prog.c
              % vi nl_prog2.c
              % cc nl_prog.c nl_prog2.c

       In this example, the strextract command uses the c_patterns file to determine which strings to match. The input source programs are named prog.c and prog2.c.

       If you need to remove any of the messages or extract one of the created strings, edit the resulting message file, prog.str. Under no conditions should you add to this file. Doing so could result in unpredictable behavior.

       You issue the strmerge command to replace the extracted strings with calls to the message catalog. In response to this command, strmerge creates the source message catalogs, prog.msg and prog2.msg, and the output source programs, nl_prog.c and nl_prog2.c.

       You must edit nl_prog.c and nl_prog2.c to include the appropriate catopen and catclose function calls.

       The gencat command creates a message catalog and the cc command creates an executable program.

SEE ALSO
       gencat(1), extract(1), strmerge(1), regexp(3), catopen(3), patterns(4)
       Writing Software for the International Market