Extracting the column containing URL from a text file


 
Posted 07-16-2014

I have a file like this:

Code:
Timestamp	URL	Text
1331635241000	http://example.com	Peoples footage at www.test.com,http://example4.com
1331635231000	http://example1.net	crack the nuts http://example6.com
1331635280000	http://example2.net	Loving this

Each column is tab-separated. I need to extract only the URLs from columns 2 and 3; if a column contains no URLs, it should be left empty. For example, the result should look like this:

Code:
URL	Text
http://example.com	www.test.com,http://example4.com
http://example1.net	http://example6.com
http://example2.net

I tried this script:
Code:
awk 'BEGIN {FS="\t"} {print $2,$3}' file | grep -oP '(((http|https|ftp|gopher)|mailto)[.:][^ >"\t]*|www\.[-a-z0-9.]+)[^ .,;\t>">\):]'

This script gives me all the URLs in a single column, without the header, so I can no longer tell which column each URL came from. Any suggestions on how to resolve this?
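One possible approach, as a rough sketch rather than a definitive answer: the column structure is lost because `grep -o` prints each match on its own line, so it may help to do the URL matching inside awk, one field at a time. The function name `extract_urls`, the helper `urls`, and the simplified URL pattern below are my own assumptions (the pattern is not the regex from the question, and it only covers `http`, `https`, `ftp`, and bare `www.` hosts), so adjust it to the URLs you actually expect.

```shell
# Hypothetical helper: print columns 2 and 3 of a tab-separated file,
# keeping only the URLs found in each column (empty if there are none).
extract_urls() {
    awk '
    BEGIN { FS = OFS = "\t" }

    # Return every URL found in s, joined with commas ("" if none).
    # out and u are local variables (awk idiom: extra parameters).
    function urls(s,    out, u) {
        out = ""
        while (match(s, /(https?|ftp):\/\/[^ \t,]+|www\.[-A-Za-z0-9.]+/)) {
            u = substr(s, RSTART, RLENGTH)
            s = substr(s, RSTART + RLENGTH)    # continue after this match
            sub(/[.;:)]+$/, "", u)             # drop trailing punctuation
            if (u != "") out = (out == "" ? u : out "," u)
        }
        return out
    }

    NR == 1 { print $2, $3; next }             # pass the header through
    { print urls($2), urls($3) }
    ' "$@"
}
```

Called as `extract_urls file`, this keeps the header line and prints one output row per input row, so a row whose Text column has no URL ends with an empty second field.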