Sponsored Content
Top Forums Shell Programming and Scripting Parsing a txt file according to two different tags Post 302936130 by RudiC on Monday 23rd of February 2015 08:19:12 AM
Old 02-23-2015
Any attempts from your side?

There are three separated records with that name in your sample file. So - what exactly do you want to achieve?

---------- Post updated at 14:19 ---------- Previous update was at 14:17 ----------

And, what's the "two different tags" in the thread header?
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Binary txt file received when i use uuencode to send txt file as attachment

Hi, I have already read a lot of posts on sending attachments in unix...but none of them were of help for my problem...so here goes.. i wanna attach a text file and send to a mail id..used the following code : uuencode "$File1" "$File1" ;|mail -s "$Mail_sub" abc@abc.com it works... (2 Replies)
Discussion started by: ash22
2 Replies

2. Shell Programming and Scripting

command to list .txt and .TXT file

Hi expersts, in my directory i have *.txt and *.TXT and *.TXT.log, *.txt.log I want list only .txt and .TXT files in one command... how to ?? //purple (1 Reply)
Discussion started by: thepurple
1 Replies

3. Shell Programming and Scripting

help parsing txt with awk

Hi all I would need some help to introduce 'break' lines into the following data by using awk: original data: RECORD 1000 aaa xxxxxx RECORD 1001 aaa xxxxxx RECORD 1002 bbb xxxxxx RECORD 1003 bbb xxxxxx RECORD 1004 bbb xxxxxx RECORD 1005 ccc xxxxxx RECORD 1006 ccc xxxxxx RECORD 1007... (5 Replies)
Discussion started by: yomaya
5 Replies

4. Shell Programming and Scripting

parsing txt file, saving graphics files

hi everyone, i am a newbie in shell programming. and i want to simply go through a text file that contains 3 "columns", split by ';' customerID ; link-to-contract ; save-as-filename so an example would simply look like this now i want to loop through every line, and save the file from... (3 Replies)
Discussion started by: Confidence
3 Replies

5. Shell Programming and Scripting

Parsing txt, xml files and preparing csv file

Hi, I need to parse text, xml files to get the statistic numbers and prepare summary csv file. What is the best way to parse these file and prepare csv file. Any idea you have , please? Regards, (2 Replies)
Discussion started by: LinuxLearner
2 Replies

6. Shell Programming and Scripting

awk append fileA.txt to growing file B.txt

This is appending a column. My question is fairly simple. I have a program generating data in a form like so: 1 20 2 22 3 23 4 12 5 43 For ever iteration I'm generating this data. I have the basic idea with cut -f 2 fileA.txt | paste -d >> FileB.txt ???? I want FileB.txt to grow, and... (4 Replies)
Discussion started by: theawknewbie
4 Replies

7. Shell Programming and Scripting

Need to append the date | abcddate.txt to the first line of my txt file

I want to add/append the info in the following format to my.txt file. 20130702|abcd20130702.txt FN|SN|DOB I tried the below script but it throws me some exceptions. <#!/bin/sh dt = date '+%y%m%d'members; echo $dt+|+members+$dt; /usr/bin/awk -f BEGIN { FS="|"; OFS="|"; } { print... (6 Replies)
Discussion started by: harik1982
6 Replies

8. Shell Programming and Scripting

XML Parsing having optional tags into flat file

In xml file i have following data where some tags like<ChrgBr> may not be present in every next file. So i want these values to be stored in some variable like var1="405360,00" , var2="DEBT" and so on ,but if <ChrgBr> tag has no value or is absent var2 should have space like var2=" " so that i... (1 Reply)
Discussion started by: sandipgawale
1 Replies

9. Windows & DOS: Issues & Discussions

Parsing a UNIX txt to separate files

I have a requirement to parse a dataflex .txt file and break it into separate files within the Windows server env. Is there any special characters I should pay particular attention on the unix side? Any ideas? Thanks in advance (2 Replies)
Discussion started by: kicklinr
2 Replies

10. Shell Programming and Scripting

Desired output.txt for reading txt file using awk?

Dear all, I have a huge txt file (DATA.txt) with the following content . From this txt file, I want the following output using some shell script. Any help is greatly appreciated. Greetings, emily DATA.txt (snippet of the huge text file) 407202849... (2 Replies)
Discussion started by: emily
2 Replies
WWW::RobotRules(3)					User Contributed Perl Documentation					WWW::RobotRules(3)

NAME
WWW::RobotsRules - Parse robots.txt files SYNOPSIS
require WWW::RobotRules; my $robotsrules = new WWW::RobotRules 'MOMspider/1.0'; use LWP::Simple qw(get); $url = "http://some.place/robots.txt"; my $robots_txt = get $url; $robotsrules->parse($url, $robots_txt); $url = "http://some.other.place/robots.txt"; my $robots_txt = get $url; $robotsrules->parse($url, $robots_txt); # Now we are able to check if a URL is valid for those servers that # we have obtained and parsed "robots.txt" files for. if($robotsrules->allowed($url)) { $c = get $url; ... } DESCRIPTION
This module parses a /robots.txt file as specified in "A Standard for Robot Exclusion", described in <http://info.webcrawler.com/mak/projects/robots/norobots.html> Webmasters can use the /robots.txt file to disallow conforming robots access to parts of their web site. The parsed file is kept in the WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited. The same WWW::RobotRules object can parse multiple /robots.txt files. The following methods are provided: $rules = WWW::RobotRules->new($robot_name) This is the constructor for WWW::RobotRules objects. The first argument given to new() is the name of the robot. $rules->parse($robot_txt_url, $content, $fresh_until) The parse() method takes as arguments the URL that was used to retrieve the /robots.txt file, and the contents of the file. $rules->allowed($uri) Returns TRUE if this robot is allowed to retrieve this URL. $rules->agent([$name]) Get/set the agent name. NOTE: Changing the agent name will clear the robots.txt rules and expire times out of the cache. ROBOTS.TXT The format and semantics of the "/robots.txt" file are as follows (this is an edited abstract of <http://info.webcrawler.com/mak/projects/robots/norobots.html>): The file consists of one or more records separated by one or more blank lines. Each record contains lines of the form <field-name>: <value> The field name is case insensitive. Text after the '#' character on a line is ignored during parsing. This is used for comments. The following <field-names> can be used: User-Agent The value of this field is the name of the robot the record is describing access policy for. If more than one User-Agent field is present the record describes an identical access policy for more than one robot. At least one field needs to be present per record. If the value is '*', the record describes the default access policy for any robot that has not not matched any of the other records. Disallow The value of this field specifies a partial URL that is not to be visited. This can be a full path, or a partial path; any URL that starts with this value will not be retrieved ROBOTS.TXT EXAMPLES The following example "/robots.txt" file specifies that no robots should visit any URL starting with "/cyberworld/map/" or "/tmp/": User-agent: * Disallow: /cyberworld/map/ # This is an infinite virtual URL space Disallow: /tmp/ # these will soon disappear This example "/robots.txt" file specifies that no robots should visit any URL starting with "/cyberworld/map/", except the robot called "cybermapper": User-agent: * Disallow: /cyberworld/map/ # This is an infinite virtual URL space # Cybermapper knows where to go. User-agent: cybermapper Disallow: This example indicates that no robots should visit this site further: # go away User-agent: * Disallow: / SEE ALSO
LWP::RobotUA, WWW::RobotRules::AnyDBM_File libwww-perl-5.65 2001-04-20 WWW::RobotRules(3)
All times are GMT -4. The time now is 04:57 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy