09-16-2011
I did ask if the text was always as shown; apparently not. This is why XML is so hard to process with awk...
Something like that would've been my suggestion to fix it anyway.
I still don't understand how that string would cause awk to mess up, though! Can you show the actual XML surrounding it?
bup-margin(1) General Commands Manual bup-margin(1)
NAME
bup-margin - figure out your deduplication safety margin
SYNOPSIS
bup margin [options...]
DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two
entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.
For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit
hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
its first 46 bits.
The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits,
that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
with far fewer objects.
If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if
you're getting dangerously close to 160 bits.
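The measurement described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how bup itself is implemented (bup reads its own index files); the function names and the sample data are made up for the example. It relies on the fact that, once equal-length digests are sorted, the pair sharing the longest prefix is always adjacent.

```python
import hashlib


def shared_prefix_bits(a: bytes, b: bytes) -> int:
    """Count the leading bits that two equal-length digests have in common."""
    # XOR zeroes out every bit position where the digests agree, so the
    # highest set bit of the result marks the first position where they differ.
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return len(a) * 8 - diff.bit_length()


def max_shared_prefix_bits(digests) -> int:
    """Largest number of prefix bits shared between any two digests."""
    # Deduplicate first: identical ids are the same object, not a collision.
    ordered = sorted(set(digests))
    # After sorting, the closest pair by prefix is always a neighbouring pair.
    return max(
        (shared_prefix_bits(p, q) for p, q in zip(ordered, ordered[1:])),
        default=0,
    )


if __name__ == "__main__":
    # Hypothetical stand-in for a repository: SHA-1 ids of 10,000 small strings.
    ids = [hashlib.sha1(str(i).encode()).digest() for i in range(10_000)]
    n = max_shared_prefix_bits(ids)
    print(f"{n} matching prefix bits")
    print(f"{160 - n} bits of margin remaining")
```

The margin printed at the end is the same subtraction the text uses: 160 bits in a SHA-1 hash minus the longest shared prefix found.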
OPTIONS
--predict
Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
from the guess. This is potentially useful for tuning an interpolation search algorithm.
--ignore-midx
don't use .midx files, use only .idx files. This is only really useful when used with --predict.
EXAMPLE
$ bup margin
Reading indexes: 100.00% (1612581/1612581), done.
40
40 matching prefix bits
1.94 bits per doubling
120 bits (61.86 doublings) remaining
4.19338e+18 times larger is possible
Everyone on earth could have 625878182 data sets
like yours, all in one repository, and we would
expect 1 object collision.
$ bup margin --predict
PackIdxList: using 1 index.
Reading indexes: 100.00% (1612581/1612581), done.
915 of 1612581 (0.057%)
SEE ALSO
bup-midx(1), bup-save(1)
BUP
Part of the bup(1) suite.
AUTHORS
Avery Pennarun <apenwarr@gmail.com>.
Bup unknown- bup-margin(1)