Sponsored Content
Top Forums Shell Programming and Scripting "links -dump" output format issue Post 302537128 by Skrynesaver on Thursday 7th of July 2011 06:50:38 AM
Old 07-07-2011
The following should work, however it will break the std width formatting of links -dump
Code:
links -dump <Your HTML File> |perl -e '
$on_page=1;
while(<STDIN>){
   $on_page=0 if $on_page && /^References$/;
   push @output,$_ if $on_page;
   $links{$1}=$2 if $in_links && /^\s+(\d+)\.\s(.+)$/;
   $in_links=1 if (!$on_page && /^\s+Visible links$/)
}
for (@output){
   s/\[(\d+)\]/[$links{$1}]/g;
   print
}'


Last edited by Skrynesaver; 07-07-2011 at 11:26 AM.. Reason: tidied code
This User Gave Thanks to Skrynesaver For This Post:
 

10 More Discussions You Might Find Interesting

1. Debian

Debian: doubt in "top" %CPU and "sar" output

Hi All, I am running my application on a dual cpu debian linux 3.0 (2.4.19 kernel). For my application: <sar -U ALL> CPU %user %nice %system %idle ... 10:58:04 0 153.10 0.00 38.76 0.00 10:58:04 1 3.88 0.00 4.26 ... (0 Replies)
Discussion started by: jaduks
0 Replies

2. Solaris

significance of "+" char in SunOS "ls -l" output

Hi, I've noticed that the permissions output from "ls -l" under SunOS differs from Linux in that after the "rwxrwxrwx" field, there is an additional "+" character that may or may not be there. What is the significance of this character? Thanks, Suan (6 Replies)
Discussion started by: sayeo
6 Replies

3. UNIX for Dummies Questions & Answers

Explanation of "total" field in "ls -l" command output

When I do a listing in one particular directory (ls -al) I get: total 43456 drwxrwxrwx 2 root root 4096 drwxrwxrwx 3 root root 4096 -rwxrwxr-x 1 nobody nobody 3701594 -rwxrwxr-x 1 nobody nobody 3108510 -rwxrwxr-x 1 nobody nobody 3070580 -rwxrwxr-x 1 nobody nobody 3099733 -rwxrwxr-x 1... (1 Reply)
Discussion started by: proactiveaditya
1 Replies

4. Shell Programming and Scripting

"Join" or "Merge" more than 2 files into single output based on common key (column)

Hi All, I have working (Perl) code to combine 2 input files into a single output file using the join function that works to a point, but has the following limitations: 1. I am restrained to 2 input files only. 2. Only the "matched" fields are written out to the "matched" output file and... (1 Reply)
Discussion started by: Katabatic
1 Replies

5. UNIX for Dummies Questions & Answers

Format output from "echo" command

Hi, I have written a BASH shell script that contains a lot of "echo" commands to notify the user about what's going on. The script generates a log file that contains a copy of what is seen in the terminal. The echo statements are generally verbose, and thus extend out for quite a ways on one... (2 Replies)
Discussion started by: msb65
2 Replies

6. Shell Programming and Scripting

Format output from the file to extract "date" section

Hello Team , I have to extract date section from the below file output. The output of the file is as shown below. I have to extract the "" this section from the above output of the file. can anyone please let me know how can we acheive this? (4 Replies)
Discussion started by: coolguyamy
4 Replies

7. Shell Programming and Scripting

Amend format of output from "last"command

When I run a “last” command, I am getting more than one row per user. Eg userabc pts/2 10.253.0.108 Tue Dec 14 09:21 still logged in userabc pts/2 10.253.0.108 Tue Dec 14 03:57 - 03:59 (00:01) userabc pts/2 10.253.0.108 Mon Dec 13 14:25 - 15:43 (01:18) userpqr pts/3 10.253.0.39 Wed Jan... (5 Replies)
Discussion started by: malts18
5 Replies

8. Shell Programming and Scripting

awk command to replace ";" with "|" and ""|" at diferent places in line of file

Hi, I have line in input file as below: 3G_CENTRAL;INDONESIA_(M)_TELKOMSEL;SPECIAL_WORLD_GRP_7_FA_2_TELKOMSEL My expected output for line in the file must be : "1-Radon1-cMOC_deg"|"LDIndex"|"3G_CENTRAL|INDONESIA_(M)_TELKOMSEL"|LAST|"SPECIAL_WORLD_GRP_7_FA_2_TELKOMSEL" Can someone... (7 Replies)
Discussion started by: shis100
7 Replies

9. Shell Programming and Scripting

""Help Me!""Beginner awk learning issue

Hi All, I have just now started learning awk from the source - Awk - A Tutorial and Introduction - by Bruce Barnett and the bad part is that I am stuck on the very first example for running the awk script. The script is as - #!/bin/sh # Linux users have to change $8 to $9 awk ' BEGIN ... (6 Replies)
Discussion started by: csrohit
6 Replies

10. Shell Programming and Scripting

Bash script - Print an ascii file using specific font "Latin Modern Mono 12" "regular" "9"

Hello. System : opensuse leap 42.3 I have a bash script that build a text file. I would like the last command doing : print_cmd -o page-left=43 -o page-right=22 -o page-top=28 -o page-bottom=43 -o font=LatinModernMono12:regular:9 some_file.txt where : print_cmd ::= some printing... (1 Reply)
Discussion started by: jcdole
1 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 06:53 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy