Validating a file
UNIX for Dummies Questions & Answers
Post 302469361 by dgeehot on Friday 5th of November 2010, 04:05:40 PM

Pardon my ignorance, but I am lost on how to do this.

I have a pipe-delimited file called "Sample.txt". Every record should have 13 fields, but some do not. I would like to set up a shell script that takes "Sample.txt" as a parameter and splits the file into the records that have 13 fields and those that don't, writing two output files, "Sample.txt.good" and "Sample.txt.bad".

Any help would be gratefully appreciated.
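A minimal sketch of one way to do this with awk, run from a small wrapper script (the script name is illustrative; the field count and file names come from the post above):

    #!/bin/sh
    # split_by_fields.sh -- split a pipe-delimited file into .good/.bad
    # based on field count. Usage: ./split_by_fields.sh Sample.txt

    file="$1"

    awk -F'|' -v good="$file.good" -v bad="$file.bad" '
        NF == 13 { print > good; next }   # exactly 13 pipe-delimited fields
        { print > bad }                   # everything else
    ' "$file"

Running ./split_by_fields.sh Sample.txt would leave the 13-field records in Sample.txt.good and everything else in Sample.txt.bad.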
 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Validating inputs from a file

Hi, I have a file called inputs. That file has values like this: 1 2 3. Now in my script called 'get.sh' I do this: exec < inputs; read a b c d. I know that there will not be any value in d. How can I check that? I need the exact condition for checking whether the variable has... (1 Reply)
Discussion started by: sendhilmani123
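A quick sketch of the emptiness test being asked about, using the variable names from the post:

    exec < inputs
    read a b c d
    if [ -z "$d" ]; then
        echo "d is empty"
    fi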

2. Shell Programming and Scripting

validating a file or directory

Hi there, I'm writing a script and trying to get the 2nd parameter and check whether it's a valid file or a valid directory. Example: ./test -a quiz1. I need to check whether quiz1 ($2) matches the name of any file or directory. Thanks (3 Replies)
Discussion started by: new2Linux
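The standard test operators cover the file-or-directory question; a small sketch:

    if [ -f "$2" ]; then
        echo "$2 is a regular file"
    elif [ -d "$2" ]; then
        echo "$2 is a directory"
    else
        echo "$2 is neither a file nor a directory" >&2
    fi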

3. Shell Programming and Scripting

validating a file based on conditions

I have a file in UNIX in which the records are like this:
aaa 123 233
aaa 234 222
aaa 242 222
bbb 122 111
bbb 122 123
ccc 124 222
In the output I want only the records below:
aaa
ccc
The validation logic: the 1st column and 2nd column need to be considered; if both columns' values are... (8 Replies)
Discussion started by: trichyselva
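The rule is cut off above, but one reading consistent with the sample data is: keep a first-column key only if its second-column values never repeat within that group (bbb repeats 122, so it is dropped). A guess at that logic in awk:

    awk '{ keys[$1] = 1; if (++pair[$1,$2] > 1) dup[$1] = 1 }
         END { for (k in keys) if (!(k in dup)) print k }' file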

4. Shell Programming and Scripting

validating a input file for numeric and character

I have an input file like this: 001|rahim|bajaj|20090102. While reading the file I need to check whether the first column is a number and the second column is a name. Is there any methodology for checking this? Thanks in advance (2 Replies)
Discussion started by: trichyselva
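A sketch of per-field checks for a line like 001|rahim|bajaj|20090102, using case patterns so it stays pure POSIX shell (the variable names are made up):

    while IFS='|' read -r id first last dt; do
        case $id in
            *[!0-9]*|'') echo "not a number: $id" >&2 ;;
        esac
        case $first in
            *[!A-Za-z]*|'') echo "not a name: $first" >&2 ;;
        esac
    done < inputfile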

5. Shell Programming and Scripting

Shell script for validating fields in a file

Hi, I have not used Unix in a very long time and I am very rusty. I would appreciate any help I can get from those more experienced and expert in shell scripting. I am reading one file at a time from a folder. The file is a flat file with no delimiters or carriage returns. Col1 through col6 is... (5 Replies)
Discussion started by: asemota
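The column layout is truncated above, but for a flat file without delimiters the usual tool is cut with character ranges; a sketch with placeholder widths, assuming one record per line:

    # the widths below are placeholders; the post's real layout is cut off
    while IFS= read -r line; do
        col1=$(printf '%s\n' "$line" | cut -c1-6)
        col2=$(printf '%s\n' "$line" | cut -c7-12)
        printf 'col1=%s col2=%s\n' "$col1" "$col2"
    done < inputfile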

6. Shell Programming and Scripting

Validating the file

Hello All, I have the following file. The first column is the type. The file will always have one H and one T record, with all the D records in between. I need to do some validations.
H|ABCD
D|TAB N0003107809CD2013-11-14|RYAN|FRY|7 DR|RICHMOND HILL|GA|32431|X|C95|000009999|000000001|TAB||C0001
D|TAB... (3 Replies)
Discussion started by: karumudi7
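Assuming H is a header and T a trailer, a sketch that accepts the file only when the first record is the sole H, the last record is the sole T, and everything in between is D:

    awk -F'|' '
        NR == 1   { ok = ($1 == "H"); next }
        $1 == "T" { t++; tline = NR; next }
        $1 != "D" { ok = 0 }
        END { exit !(ok && t == 1 && tline == NR) }
    ' file && echo "valid" || echo "invalid"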

7. Shell Programming and Scripting

Validating XML file using XSD in UNIX

Hi, I have an XML file and an XSD file (XML schema). Using a Unix script I want to validate the XML file against the XSD. The validation is in terms of datatype, field length, and null values. If the data present in the XML file does not match in terms of datatype, field length... (3 Replies)
Discussion started by: shree11
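If xmllint (part of libxml2) is available, it does this kind of XSD validation from the command line (the file names here are placeholders):

    xmllint --noout --schema schema.xsd file.xml && echo "valid" || echo "invalid"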

8. Shell Programming and Scripting

Csv file parsing and validating

Hi, I have basic knowledge of Unix shell scripting (not an expert). My requirement is to read a CSV file using the schema defined in a configuration file; if a record does not match the schema, move it to an error file, and write the matching good records to another file. In brief: ... (43 Replies)
Discussion started by: shree11
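A much-reduced sketch of the idea: the "schema" here is just a comma-separated list of field types (a made-up config format), and records failing any check go to an error file:

    awk -F',' '
        BEGIN { n = split("num,alpha,num", type, ",") }   # stands in for the config file
        {
            ok = (NF == n)
            for (i = 1; i <= n && ok; i++) {
                if (type[i] == "num"   && $i !~ /^[0-9]+$/)    ok = 0
                if (type[i] == "alpha" && $i !~ /^[A-Za-z]+$/) ok = 0
            }
            if (ok) print > "good.csv"; else print > "error.csv"
        }
    ' input.csv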

9. Shell Programming and Scripting

Validating a file in ksh

Ladies and Gents, I need to be able to verify the file and validate the file format. Is there a way to do this using ksh? Please forgive me, I'm very new to UNIX scripting.
12345_dbname_1_sqlname.sql
12345 - change number
dbname - database name
1 - sequence
sqlname - sql... (4 Replies)
Discussion started by: Philani
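ksh's extended patterns can encode that naming convention directly in a case statement; a sketch (the pattern pieces mirror the breakdown above):

    #!/bin/ksh
    case $1 in
        +([0-9])_+([A-Za-z0-9])_+([0-9])_+([A-Za-z0-9]).sql)
            echo "valid file name: $1" ;;
        *)
            echo "invalid file name: $1" >&2 ;;
    esac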
WWW::RobotRules(3)                 User Contributed Perl Documentation

NAME
    WWW::RobotRules - database of robots.txt-derived permissions

SYNOPSIS
    use WWW::RobotRules;
    my $rules = WWW::RobotRules->new('MOMspider/1.0');

    use LWP::Simple qw(get);

    {
        my $url = "http://some.place/robots.txt";
        my $robots_txt = get $url;
        $rules->parse($url, $robots_txt) if defined $robots_txt;
    }

    {
        my $url = "http://some.other.place/robots.txt";
        my $robots_txt = get $url;
        $rules->parse($url, $robots_txt) if defined $robots_txt;
    }

    # Now we can check if a URL is valid for those servers
    # whose "robots.txt" files we've gotten and parsed:
    if ($rules->allowed($url)) {
        $c = get $url;
        ...
    }

DESCRIPTION
    This module parses /robots.txt files as specified in "A Standard for Robot Exclusion", at <http://www.robotstxt.org/wc/norobots.html>. Webmasters can use the /robots.txt file to forbid conforming robots from accessing parts of their web site.

    The parsed files are kept in a WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited. The same WWW::RobotRules object can be used for one or more parsed /robots.txt files on any number of hosts.

    The following methods are provided:

    $rules = WWW::RobotRules->new($robot_name)
        This is the constructor for WWW::RobotRules objects. The first argument given to new() is the name of the robot.

    $rules->parse($robot_txt_url, $content, $fresh_until)
        The parse() method takes as arguments the URL that was used to retrieve the /robots.txt file, and the contents of the file.

    $rules->allowed($uri)
        Returns TRUE if this robot is allowed to retrieve this URL.

    $rules->agent([$name])
        Get/set the agent name. NOTE: Changing the agent name will clear the robots.txt rules and expire times out of the cache.

ROBOTS.TXT
    The format and semantics of the "/robots.txt" file are as follows (this is an edited abstract of <http://www.robotstxt.org/wc/norobots.html>):

    The file consists of one or more records separated by one or more blank lines. Each record contains lines of the form

        <field-name>: <value>

    The field name is case insensitive. Text after the '#' character on a line is ignored during parsing. This is used for comments. The following <field-names> can be used:

    User-Agent
        The value of this field is the name of the robot the record is describing access policy for. If more than one User-Agent field is present the record describes an identical access policy for more than one robot. At least one field needs to be present per record. If the value is '*', the record describes the default access policy for any robot that has not matched any of the other records.

        The User-Agent fields must occur before the Disallow fields. If a record contains a User-Agent field after a Disallow field, that constitutes a malformed record. This parser will assume that a blank line should have been placed before that User-Agent field, and will break the record into two. All the fields before the User-Agent field will constitute a record, and the User-Agent field will be the first field in a new record.

    Disallow
        The value of this field specifies a partial URL that is not to be visited. This can be a full path, or a partial path; any URL that starts with this value will not be retrieved.

    Unrecognized records are ignored.

ROBOTS.TXT EXAMPLES
    The following example "/robots.txt" file specifies that no robots should visit any URL starting with "/cyberworld/map/" or "/tmp/":

        User-agent: *
        Disallow: /cyberworld/map/ # This is an infinite virtual URL space
        Disallow: /tmp/ # these will soon disappear

    This example "/robots.txt" file specifies that no robots should visit any URL starting with "/cyberworld/map/", except the robot called "cybermapper":

        User-agent: *
        Disallow: /cyberworld/map/ # This is an infinite virtual URL space

        # Cybermapper knows where to go.
        User-agent: cybermapper
        Disallow:

    This example indicates that no robots should visit this site further:

        # go away
        User-agent: *
        Disallow: /

    This is an example of a malformed robots.txt file.

        # robots.txt for ancientcastle.example.com
        # I've locked myself away.
        User-agent: *
        Disallow: /
        # The castle is your home now, so you can go anywhere you like.
        User-agent: Belle
        Disallow: /west-wing/ # except the west wing!
        # It's good to be the Prince...
        User-agent: Beast
        Disallow:

    This file is missing the required blank lines between records. However, the intention is clear.

SEE ALSO
    LWP::RobotUA, WWW::RobotRules::AnyDBM_File

COPYRIGHT
    Copyright 1995-2009, Gisle Aas
    Copyright 1995, Martijn Koster

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.