Full Discussion: Remove duplicate text
Post 302203338 by ripat on Sunday 8th of June 2008 03:33:04 AM
You should strip the HTML tags before processing the log file. There are several ways to do that:

- the PHP function strip_tags()
- the lynx text browser with the -dump option
- the html2text utility
- sed, to get rid of the tags
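
As a rough illustration of the sed option, here is a minimal sketch. It assumes every tag opens and closes on the same line; tags that span lines, comments, and HTML entities are not handled. The function name strip_tags and the sample input are made up for this example and are not taken from the original log file.

```shell
#!/bin/sh
# strip_tags (hypothetical helper name): read HTML on stdin and delete
# everything that looks like <...>.
# Assumption: each tag starts and ends on the same line.
strip_tags() {
    sed -e 's/<[^>]*>//g'
}

# Made-up sample input, only to show the effect.
printf '%s\n' '<p>hello <b>world</b></p>' '<td>disk full</td>' | strip_tags
```

For real-world HTML the other options in the list (lynx -dump, html2text, PHP's strip_tags()) are likely to be more robust than a one-line regular expression.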

Can you post an extract of your HTML file, or attach it to your reply?
 
