Full Discussion: Hashing URLs
Post 302872493 by twjolson on Friday 8th of November 2013 12:49:22 PM
Hashing URLs

I am writing a script that reads output from Bulk Extractor (which gathers data based on regular expressions). The script should read the column containing the URL that was found, hash it with MD5, then write the URL and its hash to a file.

Where I am stuck is reading the Bulk Extractor output line by line. I want to take the 2nd column and assign it to a variable named "url". I then want to hash that URL with MD5 and assign the result to the variable "hash".

I then want to output $url and $hash to an output file, both on the same line.

But when I use:
Code:
while read line
	do
		url=`awk '{print $2}'`
		hash=`awk '{print $2}' | md5sum`
		echo $url
	done < $iname/url.txt

to walk through the input file, line by line, it globs them all together. $url has EVERY URL, and $hash is a hash value of ALL the URLs.

I know this snippet of code works - I've used it before. So, what is going on?

Here is a snippet of the input :
Code:
1691	http://www.pof.com/inbox.aspx_	\007\0000\0001\0002\0003\000(\0004_\020\035http://www.pof.com/inbox.aspx_\0200Online Dating 
1874	http://www.pof.com/sendmessage.aspx	WasHTTPNonGet_\020#http://www.pof.com/sendmessage.aspx[401152043.2\020R\011_
1927	http://www.pof.com/inbox.aspx?messagesent=1&Guid=63929064&SID=dnia5geyks5fjvr2hfesqwbg#in	01152043.2\020R\011_\020Yhttp://www.pof.com/inbox.aspx?messagesent=1&Guid=63929064&SID=dnia5geyks5fjvr2hfesqwbg#in¡\000F\020\014\020\021¬\000H\000\025\000I\000\025
2101	http://www.pof.com/viewallmessages.aspx?sender_id=41561852&message_id=17458470377&Guid=63929064&SID=dnia5geyks5fjvr2hfesqwbg_	7\000C\000D\000E\000G\000J\000M_\020|http://www.pof.com/viewallmessages.aspx?sender_id=41561852&message_id=17458470377&Guid=63929064&SID=dnia5geyks5fjvr2hfesqwbg_\020.POF.com Free O
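
A likely explanation, offered as a sketch rather than a confirmed diagnosis: inside the loop, each `awk '{print $2}'` has no file argument, so it reads standard input, which is the same redirected file that `read` is consuming. `read` takes the first line, the first awk then swallows every remaining line in one go, and the second awk is left with nothing to hash. One way around it, assuming the URL is always the second whitespace-separated field and contains no spaces, is to let `read` split the fields itself. The function name `hash_urls` is made up for illustration.

```shell
# Hypothetical helper: print "URL<TAB>MD5" for each line of the file given as $1.
# Assumes the URL is the 2nd whitespace-separated field, as in the sample input.
hash_urls() {
    # read splits each line itself: 1st field, 2nd field, and the remainder.
    # -r keeps backslashes in the trailing context column literal.
    while read -r offset url rest; do
        # Skip lines that have no second field.
        [ -n "$url" ] || continue
        # Hash only this URL; md5sum and awk read from the pipe,
        # not from the loop's redirected input file.
        hash=$(printf '%s' "$url" | md5sum | awk '{print $1}')
        printf '%s\t%s\n' "$url" "$hash"
    done < "$1"
}
```

It could be called as `hash_urls "$iname/url.txt" > "$iname/hashes.txt"`, where `hashes.txt` is a placeholder output name. `printf '%s'` hashes the URL without a trailing newline; switch to `printf '%s\n'` if the hashes need to match what `echo "$url" | md5sum` would produce.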

 
