filtering out duplicate substrings, regex string from a string
Post 302429744 by kchinnam on Tuesday 15th of June 2010 11:18:35 AM

My input file contains single-word lines.
From each line:
Quote:
a) I want to remove all text starting at 'dp', including the 'dp' itself.
Ex: prjgoodBlaBladpgoodBlaBla ---> prjgoodBlaBla
b) I also want to remove duplicate substrings.
Ex: prjtestBlaBlatestBlaBla ---> prjtestBlaBla
Logic I have in mind but am having a hard time implementing: take characters 4 through 10 (e.g. testBla); if that substring occurs again later in the string, remove all text starting from its second occurrence.
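The logic described above could be spelled out step by step in awk, roughly as follows. This is only a sketch: the function name trim_repeat is made up for illustration, and it assumes the key is always exactly characters 4 through 10 and that rule a) is applied before rule b).

```shell
# Hypothetical helper (name is illustrative): reads lines on stdin,
# applies rule a) and then rule b) to each one.
trim_repeat() {
  awk '{
    line = $0
    cut = index(line, "dp")                 # rule a: first "dp", 0 if absent
    if (cut > 0) line = substr(line, 1, cut - 1)
    key  = substr(line, 4, 7)               # characters 4 through 10
    rest = substr(line, 11)                 # everything after the key
    pos  = index(rest, key)                 # rule b: does the key occur again?
    if (length(key) == 7 && pos > 0)
      line = substr(line, 1, 10 + pos - 1)  # keep text before the second occurrence
    print line
  }'
}

printf '%s\n' prjtestBlaBlatestBlaBla prjgood1BlaBla123dpgood1BlaBla123 | trim_repeat
# prints:
# prjtestBlaBla
# prjgood1BlaBla123
```

Lines shorter than 10 characters pass through unchanged, because the length check on the key fails.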
data.txt
Code:
 
prjtestBlaBlatestBlaBla
prjthisBlaBlathisBlaBla
prjthatBlaBladpthatBlaBla
prjgoodBlaBladpgoodBlaBla
prjgood1BlaBla123dpgood1BlaBla123


Desired output -->
data_out.txt
Code:
 
prjtestBlaBla
prjthisBlaBla
prjthatBlaBla
prjgoodBlaBla
prjgood1BlaBla123

I am able to get part a) of my requirement working using the following:
Code:
 
> sed 's/dp.*//' data.txt
prjtestBlaBlatestBlaBla
prjthisBlaBlathisBlaBla
prjthatBlaBla
prjgoodBlaBla
prjgood1BlaBla123

but not part b).
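One possible way to cover part b) in the same sed call is a back-reference, sketched below. The function name strip_dp_and_repeat is hypothetical. The first expression deletes everything from the first 'dp'; the second captures characters 1 to 3, then characters 4 to 10, and drops the line's tail from a later repeat of that 7-character key. Because `.*` is greedy, this cuts at the last repeat rather than the second occurrence, which is the same place for the sample data.

```shell
# Hypothetical helper (name is illustrative): filters stdin to stdout.
strip_dp_and_repeat() {
  # 1st expression: part a), drop "dp" and everything after it.
  # 2nd expression: part b), \2 is characters 4-10; if it repeats later,
  #                 keep only the text before that repeat.
  sed -e 's/dp.*//' \
      -e 's/^\(.\{3\}\)\(.\{7\}\)\(.*\)\2.*$/\1\2\3/'
}

# Sample lines from data.txt, fed inline so the sketch runs on its own.
strip_dp_and_repeat <<'EOF'
prjtestBlaBlatestBlaBla
prjthisBlaBlathisBlaBla
prjthatBlaBladpthatBlaBla
prjgoodBlaBladpgoodBlaBla
prjgood1BlaBla123dpgood1BlaBla123
EOF
```

With the real file this would be run as `strip_dp_and_repeat < data.txt > data_out.txt`. The order of the two expressions matters: without the 'dp' removal first, the last sample line would keep a trailing "dp".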

Last edited by kchinnam; 06-15-2010 at 12:19 PM.. Reason: formatting changes
 
