Change numbers in flat file


 
# 15  
Old 11-29-2013
This awk script seems to do what you want:
Code:
awk '
# dpm(v1, v2) returns the maximum number of digits after the decimal point in
#	the two given values, or zero if one of the values does not contain ".".
# Note that this function only looks for a period to determine that the field
# should be processed.  If text appears in the file that contains a period in
# the last word on an input line, additional checks would need to be added to
# verify that the last word on the line is of the form "[-+]<digit>+.<digit>+"
# (i.e., an optional leading + or - followed by one or more decimal digits
# followed by a period followed by one or more decimal digits) and return zero
# if other characters are present or if the match is incomplete.
function dpm(v1, v2,	n1, n2, x1, x2) {
	if((x1 = index(v1, ".")) == 0 || (x2 = index(v2, ".")) == 0) return(0)
	n1 = length(v1) - x1	# # of digits after "." in v1.
	n2 = length(v2) - x2	# # of digits after "." in v2.
	return(n1 > n2 ? n1 : n2)	# return max(n1, n2).
}
NR == FNR {	# Get values from the value list file:
	# The values can be on separate lines, or space or tab separated on one
	# or more lines.
	for(i = 1; i <= NF; i++) vlist[++vlc] = $i
	next
}
p = dpm($NF, vlist[v + 1]) {
	# We are working on the 2nd input file and have skipped this clause if
	# the last field on this line does not contain a decimal point or if we
	# have run out of values from the 1st input file.
	# The ++v on the next line consumes the next value in vlist.
	if(vlist[++v] + 0 != $NF) {
		# We only get here if the last field on this line contains a
		# period and the next value in vlist[] is different from the
		# last field on this line.
		# Add a new field showing the difference between the values
		# printed with the maximum number of digits after the decimal
		# point that appeared in vlist[] and the last field on the
		# original input line.
		$(NF + 1) = sprintf("%+.*f", p, vlist[v] - $NF)
		# Replace the original last field with the value from vlist[].
		# (Adding the new field above increased NF by one, so the
		# original last field is now $(NF - 1).)
		$(NF - 1) = vlist[v]
	}
}
1	# Print the (possibly updated) line from the 2nd input file.
' varnw1 file1 > file2

It was tested with the file varnw1 containing:
Code:
20.90000 21.45000 28.25000 29.35000 1.75000 2.75000 8.30000 9.80000 1.61140
1.66910 0.91310 0.89940 102.88000 101.40000 1.35200 1.36900 12.25710 7.60000
2.58430 2.13270 4508.34000 4318.94000 2.73230 2.73230000 16097.33000 1219.47000
1219.47000

but you could put each value on a different line or all values on a single line. I added three additional zeros to the end of one value (2.73230000, shown in red in the original post) so you can see what happens when a value in this file has more digits after the decimal point than the corresponding value in file1 and the numeric values of the two numbers are different.

When file1 contains:
Code:
Values
Data 20.90000
Data1 21.45000
Value 1
Data 28.25000
Data1 29.35000
Value 2
Data 1.75000
Data1 2.75000
Value 3
Data1 8.30000
Data2 9.80000
Values data text in the text with in
Data1 1.61140
Data2  1.66910
Values text in the text
Data1 0.91310
Data2 0.89940
Data in the text
Data1 102.88000
Data2 101.40000
Data in the example
Data1 1.35200
Data2 1.36900
Text example data with text
Data1 12.25710
Data2 7.60000
Data with text
Data1 2.58430
Data2 2.13270
Data of the text
Data1 4508.34000
Data2 4318.94000
Values in data
Data1 2.73300
Data2 2.73300
Data in the text
Data1
Data2 16097.33000
Data in all text
Data1 1219.47000
Data2 1219.47000

the following will be written to file2:
Code:
Values
Data 20.90000
Data1 21.45000
Value 1
Data 28.25000
Data1 29.35000
Value 2
Data 1.75000
Data1 2.75000
Value 3
Data1 8.30000
Data2 9.80000
Values data text in the text with in
Data1 1.61140
Data2  1.66910
Values text in the text
Data1 0.91310
Data2 0.89940
Data in the text
Data1 102.88000
Data2 101.40000
Data in the example
Data1 1.35200
Data2 1.36900
Text example data with text
Data1 12.25710
Data2 7.60000
Data with text
Data1 2.58430
Data2 2.13270
Data of the text
Data1 4508.34000
Data2 4318.94000
Values in data
Data1 2.73230 -0.00070
Data2 2.73230000 -0.00070000
Data in the text
Data1
Data2 16097.33000
Data in all text
Data1 1219.47000
Data2 1219.47000

with the only two lines changed by the script being the Data1 and Data2 lines under "Values in data" (marked in red in the original post). Note that when the values in both files had 5 digits after the decimal point, the difference is shown with 5 digits after the decimal point, and when the value in the 1st file had 8 digits after the decimal point and the value in the 2nd file had 5, the difference is shown with 8 digits after the decimal point. The number of digits printed is determined by the value returned by the dpm() (decimal point max) function.
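
As a rough illustration of how dpm() picks the precision, the sketch below reuses dpm() from the script and formats one difference at a time. The show_diff wrapper and the vnew/vold variable names are hypothetical, added only for this example, and it assumes an awk whose printf accepts "*" for the precision, as the script above already does.

```shell
# Sketch only: dpm() is copied from the script above; show_diff is a
# hypothetical wrapper added for this illustration.
show_diff() {
    # $1 = value from varnw1, $2 = value from file1
    awk -v vnew="$1" -v vold="$2" '
    function dpm(v1, v2,    n1, n2, x1, x2) {
        if((x1 = index(v1, ".")) == 0 || (x2 = index(v2, ".")) == 0) return(0)
        n1 = length(v1) - x1    # digits after "." in v1
        n2 = length(v2) - x2    # digits after "." in v2
        return(n1 > n2 ? n1 : n2)
    }
    BEGIN { printf("%+.*f\n", dpm(vold, vnew), vnew - vold) }'
}

show_diff 2.73230 2.73300        # both values have 5 digits after "."
show_diff 2.73230000 2.73300     # 8 digits versus 5 digits
```

The two calls should print -0.00070 and -0.00070000, matching the two changed lines in file2.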

I hope the comments I added to the script will enable you to figure out what it is doing. If they don't, ask questions.
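
The comments at the top of the script mention that extra checks would be needed if the last word on a line could contain a period without being a number. One possible way to add that check is sketched below. The isnum() helper and the strict_dpm wrapper are hypothetical names that are not part of the script, and the regular expression is only one reading of the "[-+]<digit>+.<digit>+" form described in the comments.

```shell
# Sketch only: a stricter dpm() that returns 0 unless both values look
# like an optionally signed decimal number. isnum() and strict_dpm are
# hypothetical helpers added for this illustration.
strict_dpm() {
    awk -v a="$1" -v b="$2" '
    function isnum(s) {
        # optional sign, digits, a period, digits, and nothing else
        return s ~ /^[-+]?[0-9]+[.][0-9]+$/
    }
    function dpm(v1, v2,    n1, n2) {
        if(!isnum(v1) || !isnum(v2)) return(0)
        n1 = length(v1) - index(v1, ".")    # digits after "." in v1
        n2 = length(v2) - index(v2, ".")    # digits after "." in v2
        return(n1 > n2 ? n1 : n2)
    }
    BEGIN { print dpm(a, b) }'
}

strict_dpm 2.73300 2.73230000    # two decimal numbers
strict_dpm file.txt 2.73230      # text that merely contains a period
```

The first call should print 8 and the second 0, so a line ending in a word such as file.txt would be left alone instead of being treated as a value.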
# 16  
Old 12-03-2013
Don, thanks for your reply. I have tested it and the script runs fine, but the values are wrong.
The original value #23, 2.73300, operated with 2.73230 should equal +0.00070.
The same goes for #24, which must be +0.00070000.

Thanks again.
# 17  
Old 12-03-2013
It is interesting to note that if we put the values:
Code:
54.24425 35.24535  25.53231 24.24225 35.24535 28.350132

in varnw1 (the values shown in the current version of the 1st message in this thread) and we put the sample data from that message in file1, the result my script puts in file2 is:
Code:
Data 1
Value One 54.24425 +29.99999
Value Two 35.24535 -10.00000
Other values in the list 25.53231 -59.71304

Data 2
Value One 24.24225 -20.00000
Value Two 35.24535 +10.00000
Other values of new 28.350132 -26.785328

which matches what you said you wanted in that message.

If you decided to switch the order of the operands in the subtraction in your calculations and didn't bother telling us that you had done so, it is not a bug in the script! If you want to change your requirements again, have the courage to say that you want me to change the script again to meet your new requirements. Don't tell me there is a bug in my script when it behaves exactly as you said it should behave.

I gave you a fully commented script that tried to explain each action it was taking. Can you guess which single line in my script needs to change to meet your new requirements?
# 18  
Old 12-03-2013
Quote:
Originally Posted by Don Cragun
It is interesting to note that if we put the values:
Code:
54.24425 35.24535  25.53231 24.24225 35.24535 28.350132

in varnw1 (the values shown in the current version of the 1st message in this thread) and we put the sample data from that message in file1, the result my script puts in file2 is:
Code:
Data 1
Value One 54.24425 +29.99999
Value Two 35.24535 -10.00000
Other values in the list 25.53231 -59.71304

Data 2
Value One 24.24225 -20.00000
Value Two 35.24535 +10.00000
Other values of new 28.350132 -26.785328

which matches what you said you wanted in that message.
Don, you are right again; the script meets the requirement of the first message in this thread.
Quote:
If you decided to switch the order of the operands in the subtraction in your calculations and didn't bother telling us that you had done so, it is not a bug in the script! If you want to change your requirements again, have the courage to say that you want me to change the script again to meet your new requirements. Don't tell me there is a bug in my script when it behaves exactly as you said it should behave.
I noticed this in the last run of the script. I appreciate your help, so please do not be mad.

Quote:
I gave you a fully commented script that tried to explain each action it was taking.
Thanks a lot for that. I understand a bit more now, but I still don't have the knowledge to make changes.

Quote:
Can you guess which single line in my script needs to change to meet your new requirements?
No, I have tried, but I don't know.
# 19  
Old 12-03-2013
Try changing the line in my script that is currently:
Code:
        $(NF + 1) = sprintf("%+.*f", p, vlist[v] - $NF)

to:
Code:
        $(NF + 1) = sprintf("%+.*f", p, $NF - vlist[v])
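
To see what swapping the operands does, here is a small check using the two values discussed above. It is only a sketch: sign_demo and the vnew/vold names are hypothetical and not part of the script, and the precision is fixed at 5 digits for brevity.

```shell
# Sketch only: print the difference in the original operand order and
# then in the swapped order. sign_demo is a hypothetical helper.
sign_demo() {
    # $1 = value from varnw1, $2 = value from file1
    awk -v vnew="$1" -v vold="$2" 'BEGIN {
        printf("%+.5f\n", vnew - vold)    # original order: vlist[v] - $NF
        printf("%+.5f\n", vold - vnew)    # swapped order:  $NF - vlist[v]
    }'
}

sign_demo 2.73230 2.73300
```

This should print -0.00070 followed by +0.00070, so only the sign of the added field changes.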

# 20  
Old 12-03-2013
Quote:
I gave you a fully commented script that tried to explain each action it was taking. Can you guess which single line in my script needs to change to meet your new requirements?
I have tested replacing this line in the script:
Code:
if(vlist[++v] + 0 != $NF) {

with
Code:
if(vlist[++v] - 0 != $NF) {

Is that right?

thanks
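
A quick check suggests that this particular change does not affect the result: adding or subtracting zero only makes awk treat the value as a number, so both comparisons behave the same way and the sign of the printed difference is untouched. The sketch below is a hypothetical test (zero_demo is not part of the script) that feeds two values to awk as fields, the way the script sees $NF.

```shell
# Sketch only: compare two fields using "+ 0" and "- 0".
# zero_demo is a hypothetical helper added for this illustration.
zero_demo() {
    # $1 and $2 are the two values to compare
    echo "$1 $2" | awk '{
        r1 = ($1 + 0 != $2) ? "differ" : "same"
        r2 = ($1 - 0 != $2) ? "differ" : "same"
        print "+ 0:", r1
        print "- 0:", r2
    }'
}

zero_demo 2.73230000 2.73230    # numerically equal values
zero_demo 2.73230 2.73300       # numerically different values
```

Both lines should report "same" for the first call and "differ" for the second, which is why the sign has to be changed in the sprintf() line instead.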
# 21  
Old 12-03-2013
@faka
It is also worth clicking the 'thank' button on the right side of useful posts. If your requirements keep changing, or your real input differs from the sample input you provided, our code obviously will not work; that does not mean there is a bug in the code. Please be more specific about your requirements when posting a thread. I personally feel it wastes the time of the people who come here to help. Please also note that everyone here is a hard-working volunteer, so please value their time too.

Regards,
Akshay