Shell Programming and Scripting
what is the better way to validate records in a file
Post 302716251 by Scrutinizer on Tuesday 16th of October 2012 05:44:14 AM
Sure:

Code:
awk '
  NR==FNR{                                 # While the first file is being read (only then are FNR and NR equal),
    W[$2]=$3                               # store the maximum column width ($3) in an (associative) array,
                                           # indexed by the column number ($2); see the FS settings below
    next                                   # and proceed to the next record
  }
  {
    for(i in W)                            # For every line of the second file, check every column in array W:
      if(length($i)>W[i]){                 # if the corresponding field is longer than its maximum column width,
        print > "file.bad"                 # print that record to "file.bad"
        next                               # and proceed to the next record
      }
  }
  1                                        # If no field exceeds its maximum column width, print the record
' FS='[^0-9]*' colwidthfile FS=\| file     # Set FS to any sequence of non-digits for the first file and to "|" for the second
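For example, with a made-up colwidthfile (laid out so that the second and third digit runs on each line are the column number and its maximum width) and a small pipe-delimited data file, where '...' stands for the script above:

Code:
$ cat colwidthfile
col 1 width 4
col 2 width 10
$ cat file
abcd|short
toolongvalue|x
$ awk '...' FS='[^0-9]*' colwidthfile FS=\| file
abcd|short
$ cat file.bad
toolongvalue|x

Records that fit go to standard output; oversized ones end up in file.bad.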

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

validate the file name

Write a shell script that checks a file name like pstat_24.txt (up to 5 digits), i.e. the number can only range from 1 to 99999. Correct file names: pstat_10000.txt, pstat_12345.txt, pstat_14569.txt. Wrong file names: pstat_1234567.txt, pstat_1a2345.txt... (2 Replies)
Discussion started by: maykap100
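A minimal sketch for this kind of check, assuming the file name is passed as $1 and that 1 to 99999 means one to five digits with no leading zeros:

Code:
#!/bin/sh
# hypothetical checker: accept pstat_N.txt where N is 1-99999
if printf '%s\n' "$1" | grep -Eq '^pstat_[1-9][0-9]{0,4}\.txt$'; then
    echo "correct file name: $1"
else
    echo "wrong file name: $1"
fi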

2. Shell Programming and Scripting

How to validate a CSV file?

Hi. I think some people have already asked this, but the answers/questions seem to be about validating the contents inside a CSV file. I am simply after a simple variable solution (i.e. 0 = false, 1 = true) that I can use in my script to say that file so-and-so is actually a CSV file, or in some... (4 Replies)
Discussion started by: ElCaito
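One rough heuristic, sketched below: call the file a CSV if every line has the same number of comma-separated fields as the first line, and use the function's exit status as the true/false value. This ignores quoted commas and other CSV subtleties, and the file name is a placeholder:

Code:
# returns 0 (true) if every line has the same field count as line 1
is_csv() {
    awk -F, 'NR==1{n=NF} NF!=n{exit 1}' "$1"
}

if is_csv somefile.csv; then
    echo "looks like a CSV"
else
    echo "does not look like a CSV"
fi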

3. Shell Programming and Scripting

validate against a file

Hello all, I am having a problem writing an if condition for the following. I have a file Instance.dat which has:

#Server Environment
server1 dev
server2 dev
server3 sit

#!/bin/ksh
ENV=dev
for i in $( cat Instances.dat | grep -v '#' |awk {'print $2'} )
do
if ]... (7 Replies)
Discussion started by: chiru_h
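A possible sketch of what the loop seems to be after: read each server/environment pair from Instances.dat (the name used in the poster's loop) and compare the environment against $ENV; the echo is just a stand-in for the real action:

Code:
#!/bin/ksh
ENV=dev
grep -v '^#' Instances.dat | while read server env
do
    if [ "$env" = "$ENV" ]; then
        echo "$server runs in $ENV"
    fi
done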

4. Shell Programming and Scripting

Count No of Records in File without counting Header and Trailer Records

I have a flat file and need to count the number of records in the file, less the header and the trailer record. I would appreciate any and all assistance. Thanks, Hadi Lalani (2 Replies)
Discussion started by: guiguy
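If the file has exactly one header and one trailer record (an assumption), a sketch is simply to subtract 2 from the line count:

Code:
# count all records, then subtract one header and one trailer
awk 'END{print NR-2}' file.txt

# or with wc and shell arithmetic
n=$(wc -l < file.txt)
echo $(( n - 2 ))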

5. Shell Programming and Scripting

Validate the file

How do we validate the header record? The file number should increment by 1 (positions 17 to 19); if not, abend the process.

first week  ABC0001 20100101123
second week ABC0001 20100108124
third week  ABC0001 20100115125 (7 Replies)
Discussion started by: zooby
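A sketch under a few assumptions: last week's number is remembered in a made-up file last_seq, the current header is the first line of newfile, and the sequence numbers carry no leading zeros:

Code:
#!/bin/sh
# positions 17-19 of the first line hold the file number
cur=$(head -n 1 newfile | cut -c17-19)
prev=$(cat last_seq)
if [ "$cur" -ne $(( prev + 1 )) ]; then
    echo "file number $cur does not follow $prev - abending" >&2
    exit 1
fi
printf '%s\n' "$cur" > last_seq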

6. UNIX for Dummies Questions & Answers

Grep specific records from a file of records that are separated by an empty line

Hi everyone. I am a newbie to Linux stuff. I have this kind of problem which I couldn't solve alone. I have a text file with records separated by empty lines, like this:

ID: 20
Name: X
Age: 19

ID: 21
Name: Z

ID: 22
Email: xxx@yahoo.com
Name: Y
Age: 19

I want to grep records that... (4 Replies)
Discussion started by: Atrisa
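awk's paragraph mode fits this well: with RS set to the empty string, each blank-line-separated block becomes one record, so a pattern match prints whole records. For example, to pull every record containing "Age: 19" (file name made up):

Code:
awk -v RS= -v ORS='\n\n' '/Age: 19/' records.txt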

7. Shell Programming and Scripting

Deleting duplicate records from file 1 if records from file 2 match

I have 2 files: "File 1" is delimited by ";" and "File 2" is delimited by "|". File 1 below (3 records shown):

Doc1;03/01/2012;New York;6 Main Street;Mr. Smith 1;Mr. Jones
Doc2;03/01/2012;Syracuse;876 Broadway;John Davis;Barbara Lull
Doc3;03/01/2012;Buffalo;779 Old Windy Road;Charles... (2 Replies)
Discussion started by: vestport
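A generic two-file awk sketch in the same spirit as the answer above (a per-file FS on the command line). The join key is a guess, since the thread does not show File 2; both files are assumed to carry the document id in their first field:

Code:
# keep only File 1 records whose first field does not appear in File 2
awk 'NR==FNR{seen[$1]; next} !($1 in seen)' FS='|' file2 FS=';' file1 > file1.cleaned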

8. UNIX for Advanced & Expert Users

Wanted best way to validate delimited file records

Actually, I posted about this issue before, but many folks misunderstood my question. We are checking the records of a delimited file for validity. The delimited file will have data like this:

Aaaa|sdfhxfgh|sdgjhxfgjh|sdgjsdg|sgdjsg|
Aaaa|sdfhxfgh|sdgjhxfgjh|sdgjsdg|sgdjsg|... (3 Replies)
Discussion started by: Seshendranath
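A first-pass sketch: flag every record whose pipe-delimited field count differs from the expected one. For the sample shown that is 6, because awk also counts the empty field after the trailing "|"; adjust for the real layout:

Code:
awk -F'|' 'NF != 6 {print "bad record " NR ": " $0}' datafile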

9. Shell Programming and Scripting

Separate records of a file on 2 types of records

Hi, I am new to shell programming in Unix; please help if you can. I have a file structure of a header record and "N" detail records. The header record holds the total number of detail records. I need to split the file in 2: one file for the header, another for all detail records. Could... (1 Reply)
Discussion started by: jamcogar
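Since the header is the first record, a simple sketch (made-up file names):

Code:
head -n 1 infile > header.out      # first record: the header
tail -n +2 infile > detail.out     # remaining records: the details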

10. UNIX for Beginners Questions & Answers

Validate csv file

Hi guys, I want to validate the number of columns in a csv file, but if there is a comma (,) in any of the data values it should be skipped, counting only valid (,) commas, e.g.:

1. abc,pqrs,1234,567,hhh - result: 4
2. abc,pqrs,1234,567,hhh,"in,valid",end12 - result: 6

Here the script should skip the comma inside... (10 Replies)
Discussion started by: harry123
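One sketch uses gawk's FPAT (gawk-specific): FPAT describes what a field looks like instead of what a separator looks like, so a quoted field containing commas stays a single field. Printing NF-1 then gives the number of field-separating commas:

Code:
gawk -v FPAT='([^,]+)|("[^"]+")' '{print NR ": " NF-1 " valid commas"}' file.csv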