Hashing URLs


 
# 1  
Old 11-08-2013
Hashing URLs

So, I am writing a script that will read output from Bulk Extractor (which gathers data based on regular expressions). My script then reads the column that has the URL found, hashes it with MD5, then outputs the URL and hash to a file.

Where I am stuck is reading the Bulk Extractor output line by line. I want to take the 2nd column and store it in a variable named "url". I then want to hash that URL with MD5 and assign the result to the variable "hash".

I then want to output $url and $hash to an output file, both on the same line.

But, when I use :
Code:
while read line
	do
		url=`awk '{print $2}'`
		hash=`awk '{print $2}' | md5sum`
		echo $url
	done < $iname/url.txt

to walk through the input file, line by line, it globs them all together. $url has EVERY URL, and $hash is a hash value of ALL the URLs.

I know this snippet of code works - I've used it before. So, what is going on?

Here is a snippet of the input :
Code:
1691	http://www.pof.com/inbox.aspx_	\007\0000\0001\0002\0003\000(\0004_\020\035http://www.pof.com/inbox.aspx_\0200Online Dating 
1874	http://www.pof.com/sendmessage.aspx	WasHTTPNonGet_\020#http://www.pof.com/sendmessage.aspx[401152043.2\020R\011_
1927	http://www.pof.com/inbox.aspx?messagesent=1&Guid=63929064&SID=dnia5geyks5fjvr2hfesqwbg#in	01152043.2\020R\011_\020Yhttp://www.pof.com/inbox.aspx?messagesent=1&Guid=63929064&SID=dnia5geyks5fjvr2hfesqwbg#in¡\000F\020\014\020\021¬\000H\000\025\000I\000\025
2101	http://www.pof.com/viewallmessages.aspx?sender_id=41561852&message_id=17458470377&Guid=63929064&SID=dnia5geyks5fjvr2hfesqwbg_	7\000C\000D\000E\000G\000J\000M_\020|http://www.pof.com/viewallmessages.aspx?sender_id=41561852&message_id=17458470377&Guid=63929064&SID=dnia5geyks5fjvr2hfesqwbg_\020.POF.com Free O

# 2  
Old 11-08-2013
Quote:
I know this snippet of code works - I've used it before.
I doubt it:
iname is undefined.
$2 is undefined (or, at least, has nothing to do with your input file).
hash is NOT output to anywhere.
url and hash, in your snippet, won't accumulate value after value.
An output file is not defined.
And the input file looks extremely garbled.

Quote:
So, what is going on?
I guess, what you told the code to do...
# 3  
Old 11-08-2013
Quote:
Originally Posted by RudiC
I doubt it:
iname is undefined.
$2 is undefined (or, at least, has nothing to do with your input file).
hash is NOT output to anywhere.
url and hash, in your snippet, won't accumulate value after value.
An output file is not defined.
And the input file looks extremely garbled.

I guess, what you told the code to do...
Like I said, it is a snippet; even more so, it is a snippet of a work in progress. So all the supposedly undefined things are defined elsewhere in the script, and the outputs you say are missing will be put in place later.

My question was about the while structure and why it is globbing data instead of stepping through line by line, not whether or not variables are defined.

And if by "extremely garbled" you mean a tab-separated value text document, then yes, it is "extremely garbled".

Moderator's Comments:
Mod Comment Neo - Deleted inappropriate personal comment.
# 4  
Old 11-08-2013
I don't care what kind of snippet this is, RudiC is correct. This snippet can't work (even if the variables are defined to reasonable values). You have two invocations of awk with no input specified. Presumably the 1st one will gobble up all of the data remaining in $iname/url.txt after the read in the while loop grabbed the 1st line. Then the 2nd awk will immediately hit end-of-file.
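To see the effect in isolation, here is a minimal sketch of the same loop. The three-line sample file and the `iterations` counter are made up for the demonstration and stand in for $iname/url.txt; they are not from the original script.

```shell
#!/bin/sh
# Hypothetical sample data standing in for $iname/url.txt.
tmp=$(mktemp)
printf '1\thttp://a.example/\tx\n2\thttp://b.example/\ty\n3\thttp://c.example/\tz\n' > "$tmp"

iterations=0
while read line
do
    # awk is given no file argument, so it inherits the loop's stdin
    # and reads everything that "read" has not consumed yet.
    url=$(awk '{print $2}')
    iterations=$((iterations + 1))
done < "$tmp"

echo "iterations: $iterations"
echo "url holds: $url"
rm -f "$tmp"
```

The loop body runs only once: read takes the first line, awk swallows the remaining two, and the next read hits end-of-file. Note that the first URL never reaches awk at all, because it is sitting unused in $line.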

You came here asking for help.
You insulted the person who pointed out that the data you supplied was incomplete (thereby making analysis difficult).
This is a great way to discourage readers in this forum who might consider trying to help solve your problem from posting any other responses.

Last edited by Don Cragun; 11-08-2013 at 02:50 PM.. Reason: Fix typo
# 5  
Old 11-08-2013
Quote:
Originally Posted by twjolson
Like I said, it is a snippet; even more so, it is a snippet of a work in progress. So all the supposedly undefined things are defined elsewhere in the script, and the outputs you say are missing will be put in place later.

My question was about the while structure and why it is globbing data instead of stepping through line by line, not whether or not variables are defined.

And if by "extremely garbled" you mean a tab-separated value text document, then yes, it is "extremely garbled".

Thanks for nothing, idiot.
What is this? Do you know the rules of this forum?

(1) No flames, shouting (all caps), sarcasm, bullying, profanity or arrogant posts.

People here spend their time helping others, and they are not paid for it. If you need help in this forum in the future, behave politely. You should get an infraction for this. I know RudiC, and he never answers or replies without a reason. I personally felt very bad that you insulted such a great Advisor who spends his time on others. Don't do this again.

Akshay Hegde

Last edited by Akshay Hegde; 11-08-2013 at 03:00 PM..
# 6  
Old 11-08-2013
Quote:
Originally Posted by Don Cragun
I don't care what kind of snippet this is, RudiC is correct. This snippet can't work (even if the variables are defined to reasonable values). You have two invocations of awk with no input specified. Presumably the 1st one will gobble up all of the data remaining in $iname/url.txt after the read in the while loop grabbed the 1st line. Then the 2nd awk will immediately hit end-of-file.

You came here asking for help.
You insulted the person who pointed out that the data you supplied was incomplete (thereby making analysis difficult).
This is a great way to discourage readers in this forum who might consider trying to help solve your problem from posting any other responses.
I did come here for help, he wasn't helpful, and was insulting.

I get what you're saying about 1 awk instance gobbling up all the incoming data. Thank you, that is helpful. I guess I am at a loss on how to proceed. I need to take the data from a line, and do two things with it. Any suggestions for how to proceed?

The input is supplied by < $iname/url.txt, is it not? I mean the while loop is looping through something?
# 7  
Old 11-09-2013
I would let bash's read split each line into fields first, so that no awk is needed inside the loop:
Code:
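# Split each tab-separated line into offset, URL and remainder (discarded in _).
while IFS=$'\t' read -r id url _; do
    # Hash the URL without a trailing newline; cut drops md5sum's trailing "  -".
    hash=$(printf '%s' "$url" | md5sum | cut -d' ' -f1)
    echo "$hash $url"
done < "$iname"/url.txt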
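Building on the loop above, here is a rough sketch of the whole task as described in the first post, with the URL and its hash written to an output file on the same line. The temporary directory, the output name url_hashes.txt and the two sample lines are assumptions made up for the example; only url.txt and the column layout come from the thread. It uses a portable way of setting the tab separator so it also runs under plain sh.

```shell
#!/bin/sh
# Hypothetical locations: point iname at the real Bulk Extractor output directory.
iname=$(mktemp -d)
outfile="$iname/url_hashes.txt"

# Made-up sample input in the same offset<TAB>URL<TAB>context layout.
printf '1691\thttp://www.pof.com/inbox.aspx_\tcontext\n1874\thttp://www.pof.com/sendmessage.aspx\tcontext\n' > "$iname/url.txt"

tab=$(printf '\t')   # portable stand-in for bash's $'\t'

while IFS="$tab" read -r offset url _; do
    # md5sum prints "<hash>  -" when reading stdin; keep only the hash field.
    hash=$(printf '%s' "$url" | md5sum | cut -d' ' -f1)
    printf '%s\t%s\n' "$url" "$hash"
done < "$iname/url.txt" > "$outfile"

cat "$outfile"
```

Because printf '%s' is used, the hash covers the URL alone. Piping echo "$url" into md5sum instead would include a trailing newline and give a different hash, so pick one and stay consistent if the hashes are to be compared with other tools.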
