text transformation with sed or awk
Post 302319745 by ghostdog74 on Tuesday 26th of May 2009 06:00:54 AM
If you have Python, here is an almost complete alternative solution:
Code:
 
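#!/usr/bin/env python
# Python 2 script: urllib2 and the print statement do not exist in Python 3.
import urllib2,re
# Capture everything between the tdBlancBold span and the centered div.
pat=re.compile(""".*<span class="tdBlancBold">(.*)<div align="center">.*""",re.M|re.DOTALL)
days=['lundi','mardi','mercredi','jeudi','vendredi','samedi','dimanche']
url="http://www.natureetdecouvertes.com/pages/gener/view_FO_STORE_corgen.asp?mag_cod=%s"
for num in range(101,174):
    page=urllib2.urlopen(url % str(num))
    data=page.read()
    # Pages containing "Impossible" are treated as missing.
    if "Impossible" not in data:
        result = pat.findall(data)
        store={}    # day name -> list of times found for that day
        for i in result:
            for j in i.split("<br>"):
                j=j.strip()
                if j.startswith("le"):
                    j=j.split()
                    # The second word is the day; the times are the
                    # third-from-last and last words of the line.
                    if j[1] in days:
                        t1,t2=j[-3],j[-1]
                        store.setdefault(j[1],[])
                        store[j[1]].extend([t1,t2])
        # One output row per page, one column per day.
        for DAY in days:
            try:
                print "%s |" %( ' '.join(store[DAY])),
            except KeyError:
                # No hours found for this day: print an empty column.
                print "\t\t|",
        print ""
    else:
        print "Page not found ",url % str(num)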

Extract of the output:
Code:
# python test.py
10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 19.00 |
10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 |             |
10.00 20.00 | 10.00 20.00 | 10.00 20.00 |               | 10.00 21.00 | 10.00 20.00 |           |
10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 |             |
10.00 19.30 | 10.00 19.30 | 10.00 19.30 | 10.00 19.30 | 10.00 19.30 | 10.00 19.30 |             |
10.00 21.00 | 10.00 21.00 | 10.00 21.00 | 10.00 21.00 | 10.00 21.00 | 10.00 20.00 |             |
10.00 19.30 | 10.00 19.30 | 10.00 19.30 | 10.00 19.30 | 10.00 19.30 | 10.00 19.30 |             |
10.00 20.00 | 10.00 21.00 | 10.00 21.00 | 10.00 21.00 | 10.00 21.00 | 10.00 20.00 |             |
9.30 19.30 | 9.30 19.30 | 9.30 19.30 | 9.30 19.30 | 9.30 19.30 | 9.30 19.30 |           |
Page not found  http://www.natureetdecouvertes.com/pages/gener/view_FO_STORE_corgen.asp?mag_cod=110
10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 |             |
10.00 19.30 | 10.00 19.30 | 10.00 19.30 | 10.00 19.30 | 10.00 19.30 | 10.00 19.30 |             |
10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 |             |
10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 | 10.00 20.00 |             |
9.30 19.30 | 9.30 19.30 | 9.30 19.30 | 9.30 19.30 | 9.30 19.30 | 9.30 19.30 |           |
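
For anyone on Python 3, where urllib2 is gone (its functionality moved to urllib.request) and print is a function, here is a rough sketch of just the parsing and row-formatting steps from the script above, run against an in-memory string instead of the live site. The SAMPLE fragment is made up for illustration, since the real page markup is not shown in this thread, and the names parse_hours and format_row are hypothetical. The length guard on short lines is also an addition, so treat this as a starting point rather than a drop-in replacement.

```python
import re

DAYS = ['lundi', 'mardi', 'mercredi', 'jeudi', 'vendredi', 'samedi', 'dimanche']

# Same pattern as the script above.
PAT = re.compile(r'.*<span class="tdBlancBold">(.*)<div align="center">.*',
                 re.M | re.DOTALL)

# Hypothetical page fragment; the real markup may differ.
SAMPLE = (
    '<html><span class="tdBlancBold">Horaires<br>'
    'le lundi 10.00 - 20.00<br>'
    'le mardi 10.00 - 20.00<br>'
    'le dimanche 10.00 - 19.00<br>'
    '<div align="center"></div></html>'
)


def parse_hours(html):
    """Return {day: [t1, t2, ...]} for lines shaped like 'le <day> ... t1 <sep> t2'."""
    store = {}
    for block in PAT.findall(html):
        for line in block.split("<br>"):
            words = line.strip().split()
            # Added guard: skip lines too short to hold "le", a day and two times.
            if len(words) >= 5 and words[0].startswith("le") and words[1] in DAYS:
                # Times are the third-from-last and last words, as in the original.
                store.setdefault(words[1], []).extend([words[-3], words[-1]])
    return store


def format_row(store):
    """Build one '|'-separated row with a column per day; missing days become tabs."""
    cells = [" ".join(store[day]) if day in store else "\t\t" for day in DAYS]
    return " | ".join(cells) + " |"


print(format_row(parse_hours(SAMPLE)))
```

Keeping the parsing in a function that takes a string makes it easy to check against a saved copy of a page before pointing it at the real URLs.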

 
