I need help with a script that removes all HTML tags from an HTML document, removes any consecutive duplicate lines, and saves the result as a text document. The user should be able to pass the name of an HTML file as an argument to the script; if none is provided, the script should prompt the user for the file name.
So far I have
I'm not sure how to combine that with code to remove consecutive duplicate lines.
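One way to combine the two steps is to pipe the tag-stripped text through `uniq`, which drops consecutive duplicate lines. The following is a minimal sketch, not a definitive solution: the function names `strip_tags` and `html_to_text` and the choice of output name (the input name with `.html` replaced by `.txt`) are assumptions for illustration, and the `sed` expression only handles tags that open and close on the same line.

```shell
#!/bin/sh
# Sketch: strip HTML tags, drop consecutive duplicate lines, save as text.

# Remove anything that looks like <...> within a single line.
# Assumption: tags do not span several lines.
strip_tags() {
  sed -e 's/<[^>]*>//g'
}

# Convert one HTML file to text.
# Uses the first argument as the file name, or prompts if it is empty.
html_to_text() {
  file=${1:-}
  if [ -z "$file" ]; then
    printf 'Enter the HTML file name: ' >&2
    read -r file || return 1
  fi
  if [ ! -r "$file" ]; then
    printf 'Cannot read file: %s\n' "$file" >&2
    return 1
  fi
  # Hypothetical naming rule: page.html -> page.txt
  out="${file%.html}.txt"
  # uniq removes only adjacent repeated lines, which is what is asked for.
  strip_tags < "$file" | uniq > "$out"
  printf 'Wrote %s\n' "$out" >&2
}
```

To use it as a script, add a final line such as `html_to_text "${1:-}"`, so that running it with no argument triggers the prompt.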
I have the following file content (3 fields per line):
23 888 10.0.0.1
dfh 787 10.0.0.2
dssf dgfas 10.0.0.3
dsgas dg 10.0.0.4
df dasa 10.0.0.5
df dag 10.0.0.5
dfd dfdas 10.0.0.5
dfd dfd 10.0.0.6
daf nfd 10.0.0.6
...
As can be seen, the third field is an IP address and is sorted, but... (3 Replies)
Hi, I have a huge file of about 50GB with many lines. The file format looks like this:
21 rs885550 0 9887804 C C T C C C C C C C
21 rs210498 0 9928860 0 0 C C 0 0 0 0 0 0
21 rs303304 0 9941889 A A A A A A A A A A
22 rs303304 0 9941890 0 A A A A A A A A A
The question is that there are a few... (4 Replies)
Trying to cut down the size of some log files. Now that I write this out, it looks more difficult than I thought it would be.
Need a bash script or command that goes sequentially through all lines of a file, and does this:
if field1 (space separated) is the number 2012 print the entire line. Do... (7 Replies)
Use and complete the template provided. The entire template must be completed. If you don't, your post may be deleted!
1. The problem statement, all variables and given/known data:
You will write a script that will remove all HTML tags from an HTML document and remove any consecutive... (3 Replies)
Hi,
I have a csv file which contains some millions of lines in it.
The first line (the header) repeats at every 50000th line. I want to remove all the duplicate headers from the second occurrence onward (the first line should not be removed).
I don't want to use any pattern from the Header as I have some... (7 Replies)
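A possible approach that avoids hard-coding any pattern from the header is to remember the first line and drop every later line that is identical to it. This is only a sketch under that reading of the question (the post is truncated); the function name `drop_repeated_headers` is hypothetical.

```shell
# Print the first line, then every later line that differs from it.
# The header text is taken from the file itself, not from a fixed pattern.
drop_repeated_headers() {
  awk 'NR == 1 { header = $0; print; next } $0 != header' "$@"
}
```

For example, `drop_repeated_headers data.csv > cleaned.csv` (both file names are placeholders) reads the file once, so it should also be workable on files with millions of lines.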
Hi,
In an ideal scenario, I will have a listing of db transaction logs that get copied to a DR site, and if I have them all, they will be numbered consecutively as below.
1_79811_01234567.arc
1_79812_01234567.arc
1_79813_01234567.arc
1_79814_01234567.arc
1_79815_01234567.arc... (3 Replies)
Hi All,
I am storing the result in the variable result_text using the code below.
result_text=$(printf "$result_text\t\n$name")
result_text then contains the text below, which has duplicate lines.
file and time for the interval 03:30 - 03:45
file and time for the interval 03:30 - 03:45 ... (4 Replies)
Hello,
I'm trying to remove duplicate consecutive lines that contain the specific string "WARNING".
File.txt
abc;
WARNING 2345
WARNING 2345
WARNING 2345
WARNING 2345
WARNING 2345
bcd;
abc;
123
123
123
WARNING 1234
WARNING 2345
WARNING 2345
efgh; (6 Replies)
Discussion started by: Mannu2525
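One way to approach this, sketched with `awk`: skip a line only when it contains "WARNING" and is identical to the line immediately before it, and leave every other line alone (so the repeated `123` lines in the sample stay). The function name `dedupe_warnings` is an assumption for illustration.

```shell
# Drop a line only if it contains WARNING and equals the previous line.
# Other consecutive duplicates (such as the 123 lines) are kept.
dedupe_warnings() {
  awk '!(index($0, "WARNING") && $0 == prev) { print } { prev = $0 }' "$@"
}
```

Run as `dedupe_warnings File.txt`, this should keep one `WARNING 2345` from each run while keeping `WARNING 1234` followed by `WARNING 2345`, since those two lines differ.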
git-stripspace
GIT-STRIPSPACE(1) Git Manual GIT-STRIPSPACE(1)
NAME
git-stripspace - Remove unnecessary whitespace
SYNOPSIS
git stripspace [-s | --strip-comments] < input
DESCRIPTION
Clean the input in the manner used by git for text such as commit messages, notes, tags and branch descriptions.
With no arguments, this will:
o remove trailing whitespace from all lines
o collapse multiple consecutive empty lines into one empty line
o remove empty lines from the beginning and end of the input
o add a missing "\n" to the last line if necessary.
In the case where the input consists entirely of whitespace characters, no output will be produced.
NOTE: This is intended for cleaning metadata, prefer the --whitespace=fix mode of git-apply(1) for correcting whitespace of patches or
files in the repository.
OPTIONS
-s, --strip-comments
Skip and remove all lines starting with #.
EXAMPLES
Given the following noisy input with $ indicating the end of a line:
|A brief introduction $
| $
|$
|A new paragraph$
|# with a commented-out line $
|explaining lots of stuff.$
|$
|# An old paragraph, also commented-out. $
| $
|The end.$
| $
Use git stripspace with no arguments to obtain:
|A brief introduction$
|$
|A new paragraph$
|# with a commented-out line$
|explaining lots of stuff.$
|$
|# An old paragraph, also commented-out.$
|$
|The end.$
Use git stripspace --strip-comments to obtain:
|A brief introduction$
|$
|A new paragraph$
|explaining lots of stuff.$
|$
|The end.$
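For readers without git at hand, the default (no-argument) behavior described above can be approximated with a short `awk` filter. This is a rough sketch, not git's implementation: the function name `stripspace_sketch` is hypothetical, only spaces, tabs, and carriage returns are treated as trailing whitespace, and the --strip-comments option is not covered.

```shell
# Approximate `git stripspace` with no arguments:
#  - trim trailing whitespace on every line
#  - collapse runs of empty lines into one
#  - drop empty lines at the start and end of the input
#  - end the output with a newline (awk's print always adds one)
stripspace_sketch() {
  awk '
    { sub(/[ \t\r]+$/, "") }           # trim trailing whitespace
    $0 == "" { pending = seen; next }  # remember a gap only after real text
    { if (pending) print ""            # emit one empty line for the whole gap
      pending = 0; seen = 1; print }
  '
}
```

Because an empty line is only printed once a later non-empty line arrives, trailing empty lines are never emitted, and input made up entirely of whitespace produces no output.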
GIT
Part of the git(1) suite
Git 1.7.10.4 11/24/2012 GIT-STRIPSPACE(1)