Checking number of commas in each line.


 
# 1  
Old 04-04-2013
Checking number of commas in each line.

Hi All,

I am checking whether each line has exactly "n" commas. If any line does not, I need to exit the process.

I tried

Code:
cat "$TEMP_FILE" | while read LINE
do
	processing_line=`expr $processing_line + 1`
	no_of_delimiters=`echo "$LINE" | awk -F ',' '{ print NF }'`
	if [ $no_of_delimiters -ne $no_of_expected_fields ]
	then
		echo "Error at line $processing_line"
		exit
	fi
done

It's working fine. However, the file has around 0.5 million records, so it takes too long to process. Is there any way I can improve the performance?

Thanks in advance.
# 2  
Old 04-04-2013
You can replace your whole code with one awk statement:

Code:
awk -F, 'NF!=3{print "Error at line "NR;exit}' "$TEMP_FILE"

Replace 3 with the number of fields you expect on each line.

Guru.

Last edited by guruprasadpr; 04-04-2013 at 09:29 AM.. Reason: Updated NF to NR, thanks to Pikk45 for pointing it out
# 3  
Old 04-04-2013
Don't cat a very huge file.

Try something like:
Code:
awk -F"," -v l="$no_of_expected_fields" '{if(NF != l){print "Error at line "NR; exit}}' "$TEMP_FILE"

@guru:
You got me on this one :)

---------- Post updated at 05:57 PM ---------- Previous update was at 05:52 PM ----------

Quote:
Originally Posted by guruprasadpr
You can replace your whole code with one awk statement:

Code:
awk -F,  'NF!=3{print "Error at line "NF;exit}' $TEMP_FILE

In place of 3, put the count of your no. of expected fields.

Guru.
I guess the test should be that NF is not equal to the specified number. You can use
Code:
awk -F"," -v l="$no_of_expected_fields" 'NF!=l{ print "Error at line "NR; exit}' "$TEMP_FILE"

if you need to compare against a variable.

Last edited by PikK45; 04-04-2013 at 09:23 AM.. Reason: Guru the great! :)
# 4  
Old 04-04-2013
Quote:
Originally Posted by PikK45
Don't cat a very huge file.
cat-ing small files is the biggest problem, really. For a huge file, the overhead of running cat doesn't matter all that much. But running cat 10,000 times to process 10,000 tiny files will slow it down a lot, the same way it takes longer to say a sentence if you must make a separate phone call for each word.
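
The same reasoning applies to the loop in post #1, which starts expr, echo and awk once per line. As a rough sketch only (first_bad_line is a made-up name, and it counts commas rather than fields), this is what the loop could look like with no external process per line. It assumes a POSIX shell; for half a million lines a single awk call is still likely to be quicker.

```shell
# Hypothetical helper: print the number of the first line in FILE that does
# not contain exactly N commas and return 1; return 0 if every line matches.
# Usage: first_bad_line FILE N
first_bad_line() {
    file=$1
    want=$2
    lineno=0
    while IFS= read -r line; do
        lineno=$((lineno + 1))          # built-in arithmetic, no expr process
        rest=$line
        commas=0
        while :; do                     # strip up to each comma and count it
            case $rest in
                *,*) rest=${rest#*,}; commas=$((commas + 1)) ;;
                *)   break ;;
            esac
        done
        if [ "$commas" -ne "$want" ]; then
            echo "$lineno"
            return 1
        fi
    done < "$file"                      # redirection instead of cat
    return 0
}
```

Because the loop reads from a redirection instead of a pipe, it runs in the current shell, so the return inside it really does leave the function.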
# 5  
Old 04-04-2013
Quote:
Originally Posted by Anupam_Halder
Hi All,

I am checking whether each line has exactly "n" commas. If any line does not, I need to exit the process.

I tried

Code:
cat "$TEMP_FILE" | while read LINE
do
	processing_line=`expr $processing_line + 1`
	no_of_delimiters=`echo "$LINE" | awk -F ',' '{ print NF }'`
	if [ $no_of_delimiters -ne $no_of_expected_fields ]
	then
		echo "Error at line $processing_line"
		exit
	fi
done

It's working fine. However, the file has around 0.5 million records, so it takes too long to process. Is there any way I can improve the performance?

Thanks in advance.
Note that the variable names $no_of_expected_fields and $no_of_delimiters are not representative of what awk actually does. When you aren't using the default awk field separator (<space>), every occurrence of the field separator separates two fields; it doesn't terminate a field. So for every non-empty line read by awk when the value of FS is a comma (such as by having -F, on the command line), the value of NF (Number of Fields) is the number of delimiters plus 1; not the number of delimiters.
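
To see the off-by-one for yourself, a small sketch like the following prints the comma count next to NF for each line. The function name commas_and_fields is only for illustration. It relies on gsub() returning the number of substitutions it made; replacing each comma with a comma leaves the line unchanged.

```shell
# Illustrative helper: for each input line print "<commas> <fields>".
# Reads the files named as arguments, or standard input if there are none.
commas_and_fields() {
    awk -F, '{
        nf = NF                  # fields as awk sees them with FS=","
        c = gsub(/,/, ",")       # gsub returns how many commas it matched
        print c, nf
    }' "$@"
}
```

For a line such as a,b,c this reports 2 commas and 3 fields, while a completely empty line reports 0 for both, which is why the rule above only holds for non-empty lines.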
If you want to print an error for any file that does not have $n commas on every line in the file, you need something like:
Code:
awk -F, -v n="$n" 'NF!=(n+1){print "Error at line "NR;exit 1}' "$TEMP_FILE"
if [ $? -ne 0 ]
then    exit
fi
# Continue processing $TEMP_FILE...

in your script.

As always, if you are using a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of awk.
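
If the check is needed in more than one place, it could be wrapped in a function. The sketch below is only one way to do it, and has_n_commas is a hypothetical name. It compares NF against n + 1 because, as noted above, NF is the number of commas plus one.

```shell
# Hypothetical wrapper: succeed only if every line of FILE has exactly N commas.
# Usage: has_n_commas N FILE
has_n_commas() {
    n=$1
    file=$2
    # exit 1 inside awk becomes the exit status of the function,
    # so callers can test it directly instead of inspecting $?.
    awk -F, -v n="$n" '
        NF != n + 1 { print "Error at line " NR; exit 1 }
    ' "$file"
}
```

A caller could then write has_n_commas "$n" "$TEMP_FILE" || exit 1. Note that an empty line has NF equal to 0, so it is reported as an error even when n is 0.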