Check whether a given file is in ASCII format and data is tab-delimited


 
# 8  
Old 05-02-2007
You haven't answered the questions I asked, so let's start from the beginning.

(I have reformatted your post so that it is easier to understand.)

Quote:
Originally Posted by Mandab
I am totally confused now.
I am a newbie and wrote the above script with the help of this forum.
I'll get a file which is

tab delimited and
from 3rd line onwards it has data.
First field is numeric(9) not null and
second field is char(8) not null,
third field is numeric(9) null and
fourth field is (13) not null.

Are there only four fields?
If there are more, what conditions must they meet?
If a field has a length greater than 0, then it is not null; or do you mean something else by "not null"?
Quote:
My requirement is first to check whether it is in tab-delimited format or not.
If it is not, then generate an error and set var1="101" and var2="Not in tab-delimited format";
and if it is in tab-delimited format, then check
the first field's datatype and length, and also that it is not null;
if it doesn't match, then var1="110" and var2="Mismatch/Wrong Field one";
if it matches, then check the second field and
put var1="120" and var2="Mismatch/Wrong Field two", and so on.

I want to use var1 and var2 for other computation. Whatever comments you have written above have gone over my head. Please help me.

This script checks the first four fields per line.

It also gives the line number where the error occurred.
Code:
errline=$( awk 'BEGIN { FS = "\t" }  ## FS (not IFS) is the awk field separator
   NR <= 2 { next } ## skip the first two lines
   !/\t/ { exit 101 }  ## line does not contain a tab

   ## Fields 1 and 3 must be 9 characters and contain only digits
   length($1) != 9 || $1 ~ /[^0-9]/ { exit 110 }
   length($3) != 9 || $3 ~ /[^0-9]/ { exit 130 }

   ## Fields 2 and 4 must be 8 and 13 characters respectively
   length($2) != 8                  { exit 120 }
   length($4) != 13                 { exit 140 }

END { print NR }
' "$FILE"
)

var1=$?  ## Set variable to the exit code of the awk script

## Assign var2 based on awk's return code
case $var1 in
   101) var2="Not in tab-delimited format" ;;
   110) var2="Mismatch/Wrong Field one" ;;
   120) var2="Mismatch/Wrong Field two" ;;
   130) var2="Mismatch/Wrong Field three" ;;
   140) var2="Mismatch/Wrong Field four" ;;
esac

printf "Error number: %d, line %d\n" "$var1" "$errline"
printf "Error message: %s\n" "$var2"
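
A quick way to exercise a check of this kind is to run it against a small made-up file. The sketch below is self-contained: the sample rows and the temporary file (created with mktemp) are invented for illustration, the separator is set through awk's FS variable, and it assumes that a "null" third field means an empty one, which is one interpretation of the requirement and not something confirmed in the thread.

```shell
#!/bin/sh
## Hypothetical sample data: two header lines, one good row, one bad row
FILE=$(mktemp)
printf 'header one\nheader two\n' > "$FILE"
printf '123456789\tABCDEFGH\t987654321\tabcdefghijklm\n' >> "$FILE"
printf '12345\tABCDEFGH\t987654321\tabcdefghijklm\n' >> "$FILE"   ## field 1 too short

errline=$( awk 'BEGIN { FS = "\t" }
   NR <= 2 { next }     ## skip the first two lines
   !/\t/ { exit 101 }   ## line does not contain a tab

   ## Field 1 must be exactly 9 digits
   length($1) != 9 || $1 ~ /[^0-9]/ { exit 110 }

   ## Assumption: field 3 may be empty ("null"); otherwise exactly 9 digits
   $3 != "" && (length($3) != 9 || $3 ~ /[^0-9]/) { exit 130 }

   ## Fields 2 and 4 must be 8 and 13 characters respectively
   length($2) != 8  { exit 120 }
   length($4) != 13 { exit 140 }

END { print NR }        ## END still runs after exit, so NR is the failing line
' "$FILE" )
var1=$?                 ## exit status of the awk command substitution
rm -f "$FILE"

printf 'code=%s line=%s\n' "$var1" "$errline"
```

With this sample, the fourth line has a five-digit first field, so the run reports code 110 at line 4. When every line passes, awk exits with 0 and the printed number is simply the total line count.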


Last edited by cfajohnson; 05-03-2007 at 02:01 PM.. Reason: Fixed typo
# 9  
Old 05-03-2007
Marvellous !!
It's working perfectly.

Thank you so much. I appreciate it.
# 10  
Old 05-03-2007
It is not checking for tab-delimited format. The following is the error I am getting, can you please help?

Script:
#!/bin/ksh

awk -F\t 'NR>=3 { ## to start from 3rd line
!/\t/ {exit 101} ## not working
if ( $2 ~ /^ *$/ || $2 ~ /[^0-9]/ || length($1)!=7 ) {exit 102}
if ( $6 ~ /^ *$/ || $6 ~ /[^0-9]/ || length($6) !=5 ) {exit 106}
}' $1
var1=$? ## set variable to the exit code of awk script

## Assign var2 based on awk's return code
case $var1 in
101) var2="File not in tab-delimited format" ;;
102) var2="Mismatch/Wrong Field2" ;;
106) var2="Mismatch/Wrong Field6" ;;
*) var2="Success" ;;
esac
print "$var1"
print "$var2"


I am getting error :
$ test7 sample.txt
awk: syntax error near line 2
awk: illegal statement near line 2
awk: bailing out near line 3

The sample file is:
xxyyzz20070503100717001
abcd.txt000000027600000002
1234567 3809363 175268 849036 94425 284437
2271208 3809365 175268 849036 94425 284437
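
A likely cause of the syntax error: `!/\t/ {exit 101}` is a pattern-action pair, and awk accepts those only at the top level of a program, not inside another action block. Inside the braces the test has to be written as an `if`. Separately, the unquoted `-F\t` loses its backslash to the shell, so awk receives `-Ft`; some implementations treat that as a tab, but quoting it as `-F'\t'` avoids relying on that. The following is a minimal sketch with those two changes only. The function name check_file and the temporary file are made up for illustration, printf stands in for the ksh-only print, and the field conditions are kept as posted.

```shell
## check_file is a hypothetical wrapper name; it sets var1 and var2
check_file() {
   awk -F'\t' 'NR >= 3 {                 ## start from the 3rd line
      if ($0 !~ /\t/) exit 101          ## no tab anywhere on the line
      ## length($1) is kept as posted; it may have been meant as length($2)
      if ($2 ~ /^ *$/ || $2 ~ /[^0-9]/ || length($1) != 7) exit 102
      if ($6 ~ /^ *$/ || $6 ~ /[^0-9]/ || length($6) != 5) exit 106
   }' "$1"
   var1=$?   ## exit code of the awk script

   ## Assign var2 based on the return code
   case $var1 in
      101) var2="File not in tab-delimited format" ;;
      102) var2="Mismatch/Wrong Field2" ;;
      106) var2="Mismatch/Wrong Field6" ;;
      *)   var2="Success" ;;
   esac
}

tmp=$(mktemp)
## Header lines from the post, then one data row as it appears there (spaces, no tabs)
printf 'xxyyzz20070503100717001\nabcd.txt000000027600000002\n' > "$tmp"
printf '1234567 3809363 175268 849036 94425 284437\n' >> "$tmp"
check_file "$tmp"
printf '%s\n%s\n' "$var1" "$var2"
rm -f "$tmp"
```

Run against the space-separated row, this prints 101 and "File not in tab-delimited format". If the real file does use tabs, note that the sixth value in the sample rows has six digits, so the `length($6) != 5` test as posted would report 106 for them.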