How to write a shell script to extract the lines we want


 
# 1  
Old 06-11-2010

Hi,
I have a very large file. It contains lines in the format below:

seed url, html url
....
...
seed url, html url

I have already sorted it.
[11 sample lines unreadable: the forum converted each URL into a link, and the link titles are garbled by a character-encoding error]

Now I want to get 3 html urls for each seed url.
Any tips will be appreciated.

---------- Post updated at 11:15 AM ---------- Previous update was at 11:11 AM ----------

Hi,
I have a very large file. It contains lines in the format below:

seedurl1, htmlurl1
seedurl1, htmlurl2
....
seedurl1,htmlurln
.....
seedurlm,htmlurl1
seedurlm,htmlurl2
.....
seedurlm,htmlurln
......

Now I want to get the first 3 htmlurls for each seedurl.
Any tips will be appreciated.
# 2  
Old 06-11-2010
Your first post can't be read.

For your second post, do you want to get the output as below?

Code:
seedurl1, htmlurl1, htmlurl2, htmlurl3 (the first 3 urls for each seedurl)?
...
seedurlm, htmlurl1, htmlurl2, htmlurl3

# 3  
Old 06-11-2010
For the second post, I want the output to look like this:
seedurl1,htmlurl1
seedurl1,htmlurl2
seedurl1,htmlurl3
.....
seedurlm,htmlurl1
seedurlm,htmlurl2
seedurlm,htmlurl3
....


Thanks. Any ideas?
# 4  
Old 06-11-2010
Your first post can't be read; all the URLs were converted to HTTP links automatically. I guess you need to wrap CODE tags around the input file.
# 5  
Old 06-11-2010
You can treat the first and second posts as the same and ignore the first.
Just give tips on the second.
Thanks.
# 6  
Old 06-11-2010
There is probably a better solution than this one.

Code:
$ cat urfile
seedurl1,htmlurl1
seedurl1,htmlurl2
seedurl1,htmlurl3
seedurl1,htmlurl4
seedurl1,htmlurl5
seedurl1,htmlurl6
seedurlm,htmlurl1
seedurlm,htmlurl2
seedurlm,htmlurl3
seedurlm,htmlurl4
seedurlm,htmlurl5

$ awk -F , '{a[$1]=a[$1] FS $2}
            END {for (i in a) {split(a[i],b,","); printf "%s,%s\n%s,%s\n%s,%s\n",i,b[2],i,b[3],i,b[4]}} ' urfile
seedurlm,htmlurl1
seedurlm,htmlurl2
seedurlm,htmlurl3
seedurl1,htmlurl1
seedurl1,htmlurl2
seedurl1,htmlurl3
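
A streaming alternative, as a rough sketch: because the file is already sorted, lines for one seedurl are adjacent, so awk only needs to remember the current seedurl and a counter instead of storing every URL until `END`. This also keeps the input order (the `for (i in a)` loop above visits keys in an unspecified order) and prints no empty fields when a seedurl has fewer than 3 lines. The function name `first_n` and the sample file are made up for illustration, and the seedurl is assumed to contain no comma.

```shell
# first_n FILE [N]: print the first N lines (default 3) of each group,
# keyed on the first comma-separated field.
# Assumption: lines of one group are adjacent, i.e. the input is sorted.
first_n() {
    awk -F, -v n="${2:-3}" '
        $1 != prev { prev = $1; count = 0 }   # new seedurl: reset the counter
        ++count <= n                          # print only the first n lines
    ' "$1"
}

# Hypothetical sample data modelled on urfile above.
sample=$(mktemp)
printf '%s\n' \
    seedurl1,htmlurl1 seedurl1,htmlurl2 seedurl1,htmlurl3 seedurl1,htmlurl4 \
    seedurlm,htmlurl1 seedurlm,htmlurl2 > "$sample"

first_n "$sample" 3
```

On the real file this would be called as `first_n urfile 3`; memory use stays constant no matter how large the file is.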

With your real data:

Code:
$ cat urfile1
http://2010.sina.com.cn,http://2010.sina.com.cn/2010-06-09/01528724.shtml
http://2010.sina.com.cn,http://2010.sina.com.cn/2010-06-09/03238769.shtml
http://2010.sina.com.cn,http://2010.sina.com.cn/2010-06-09/04448785.shtml
http://2010.sina.com.cn,http://2010.sina.com.cn/2010-06-09/05328842.shtml
http://2010.sina.com.cn,http://2010.sina.com.cn/2010-06-09/13359200.shtml
http://zzhz.zjol.com.cn,http://zzhz.zjol.com.cn/05zzhz/system/2010/06/10/016678515.shtml
http://zzhz.zjol.com.cn,http://zzhz.zjol.com.cn/05zzhz/system/2010/06/11/016678967.shtml
http://zzhz.zjol.com.cn,http://zzhz.zjol.com.cn/05zzhz/system/2010/06/11/016679056.shtml
http://zzhz.zjol.com.cn,http://zzhz.zjol.com.cn/05zzhz/system/2010/06/11/016679169.shtml
http://zzhz.zjol.com.cn,http://zzhz.zjol.com.cn/05zzhz/system/2010/06/11/016679553.shtml
http://zzhz.zjol.com.cn,http://zzhz.zjol.com.cn/05zzhz/system/2010/06/11/016679707.shtml

$ awk -F , '{a[$1]=a[$1] FS $2}
            END {for (i in a) {split(a[i],b,","); printf "%s,%s\n%s,%s\n%s,%s\n",i,b[2],i,b[3],i,b[4]}} ' urfile1
http://zzhz.zjol.com.cn,http://zzhz.zjol.com.cn/05zzhz/system/2010/06/10/016678515.shtml
http://zzhz.zjol.com.cn,http://zzhz.zjol.com.cn/05zzhz/system/2010/06/11/016678967.shtml
http://zzhz.zjol.com.cn,http://zzhz.zjol.com.cn/05zzhz/system/2010/06/11/016679056.shtml
http://2010.sina.com.cn,http://2010.sina.com.cn/2010-06-09/01528724.shtml
http://2010.sina.com.cn,http://2010.sina.com.cn/2010-06-09/03238769.shtml
http://2010.sina.com.cn,http://2010.sina.com.cn/2010-06-09/04448785.shtml
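
If the file were not sorted, a per-seedurl counter array should do the same job in one pass. This is only a sketch: `first_n_unsorted` and the demo data are hypothetical, and it assumes the seedurl itself contains no comma.

```shell
# first_n_unsorted FILE [N]: keep the first N lines (default 3) per seedurl,
# in input order, even when lines for one seedurl are not adjacent.
# Memory grows with the number of distinct seedurls, not the number of lines.
first_n_unsorted() {
    awk -F, -v n="${2:-3}" '++seen[$1] <= n' "$1"
}

# Hypothetical demo input with interleaved keys.
demo=$(mktemp)
printf '%s\n' a,1 b,1 a,2 a,3 a,4 b,2 > "$demo"

first_n_unsorted "$demo" 2
```

With the demo input and N=2 this prints `a,1`, `b,1`, `a,2`, `b,2`, one per line.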


Last edited by rdcwayx; 06-11-2010 at 01:23 AM..
# 7  
Old 06-11-2010
thanks