Fill in missing rows with zero to have uniform table


 
# 15  
Old 03-24-2015
Quote:
Originally Posted by yifangt
[..] when I tried your expanded version of Scrutinizer's code, the output was not sorted as I expected, even though my two original files are each sorted in ascending order:
Code:
S10    11298    30165    40361        S10    1290    41074    135810
S01    36407    53706    88540        S01    0    0    0
S11    18311    37266    135798        S11    0    0    0
S02    69343    87098    87316        S02    14644    37964    70990
S12    14567    35958    62691        S12    30731    60117    77166
S03    50133    59721    107923        S03    0    0    0
S04    0    0    0        S04    52922    84177    87225
S05    1290    41074    135810        S05    43427    68368    109344
S06    11285    30164    40361        S06    4212    15654    15664
S07    11285    30164    40361        S07    0    0    0
S08    0    0    0        S08    16257    41558    45595

[..]
As I wrote in post #2 (since the input is in sorted order) :
Quote:
If it is not in the right order, you could pipe the output through sort.
Code:
awk '......' file1 file2 | sort

Then your output should be in the right order.
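To illustrate why the sort step matters: awk visits array keys in an unspecified order in a `for (k in array)` loop, so results collected that way can come out shuffled. Below is a minimal sketch, assuming zero-padded keys like S01..S12 (which sort correctly as plain text); `list_ids` is a hypothetical helper name, not something from the scripts in this thread.

```shell
# Hypothetical helper: print every key seen in column 1, one per line.
# awk's "for (k in seen)" visits keys in an unspecified order, so the
# result is piped through sort to put S01, S02, ... S10 back in sequence.
list_ids() {
    awk '{ seen[$1] } END { for (k in seen) print k }' "$@" | sort
}

# Input arrives out of order; the output is sorted.
printf '%s\n' 'S10 1 2 3' 'S02 4 5 6' 'S01 7 8 9' | list_ids
```

Without the zero padding (S1, S2, S10) a plain sort would place S10 before S2, so the padding is what makes the simple `| sort` sufficient here.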

Last edited by Scrutinizer; 03-24-2015 at 01:38 AM..
# 16  
Old 03-24-2015
Thanks Scrutinizer!
I already got your point about awk '......' file1 file2 | sort. My discussion with Don is related, but it is more about array indexing in awk, which I do not understand well enough.
My question is about this block:
Code:
  {
    i=$1
    C[i]
    $1=""
  }

I thought $1="" was not needed, but removing it causes a problem, since its purpose is to empty $1:
Code:
S01S01    36407    53706    88540        S01    0    0    0 
S02S02    69343    87098    87316        S02S02    14644    37964    70990 
S03S03    50133    59721    107923       S03    0    0    0
......

I had thought that i = $1 changes on every row, just like A[i] = $0 or B[i] = $0, where $0 does not need to be emptied after each assignment. How does $1="" work behind the scenes?
-----------------------------------------------------
Thanks Don!
Your explanation is more than enough!
Yes, I did try your code, and it gave me what I wanted. One more question, about this block:
Code:
FNR == 1 {
    f++
}

Does it mean I can have as many files as I want?
For example, I want to align 3 files the same way I did with the 2 files. I modified your code, and IT WORKS now!
Code:
awk -v n=12 '
FNR == 1 {
     f++
}
{    
    d[f, $1] = $0
}
END {    
    for(i = 1; i <=n ; i++) {
        id = sprintf("S%02d", i)
    for (j = 1; j <=f; j++) 
      {
          printf("%-37s",
                    ((j, id) in d) ? d[j,id] : id "    0    0    0")
        }
          printf("\n")
    }
}' file[12] file1

Here file1 is printed twice for demonstration purposes:
Code:
S01    36407    53706    88540        S01    0    0    0                    S01    36407    53706    88540       
S02    69343    87098    87316        S02    14644    37964    70990        S02    69343    87098    87316       
S03    50133    59721    107923       S03    0    0    0                    S03    50133    59721    107923      
S04    0    0    0                    S04    52922    84177    87225        S04    0    0    0                   
S05    1290    41074    135810        S05    43427    68368    109344       S05    1290    41074    135810       
S06    11285    30164    40361        S06    4212    15654    15664         S06    11285    30164    40361       
S07    11285    30164    40361        S07    0    0    0                    S07    11285    30164    40361       
S08    0    0    0                    S08    16257    41558    45595        S08    0    0    0                   
S09    0    0    0                    S09    0    0    0                    S09    0    0    0                   
S10    11298    30165    40361        S10    1290    41074    135810        S10    11298    30165    40361       
S11    18311    37266    135798       S11    0    0    0                    S11    18311    37266    135798      
S12    14567    35958    62691        S12    30731    60117    77166        S12    14567    35958    62691

In fact I have 44 files to align side by side. By replacing the line printf("%-37s %s\n",....) with another loop, any number of files can be aligned side by side. Any comments on this?
Thanks a lot!
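A small sketch of why the `FNR == 1 { f++ }` block scales to any number of files: FNR restarts at 1 for each file operand, so the rule fires once per file and `f` ends up as the number of the file currently being read. The helper name `count_per_file` and the temporary files are made up for this demonstration.

```shell
# Hypothetical helper: report how many records each file operand holds.
# FNR restarts at 1 for every file, so "FNR == 1" fires once per file
# (an empty file never triggers it, so it is not counted).
count_per_file() {
    awk '
    FNR == 1 { f++ }        # first record of a new file: bump the file counter
    { n[f]++ }              # count this record under the current file number
    END { for (j = 1; j <= f; j++) printf("file %d: %d records\n", j, n[j]) }
    ' "$@"
}

# Demo with throwaway files; the first file is passed twice on purpose.
tmp=$(mktemp -d)
printf 'S01 1\nS02 2\n' > "$tmp/a"
printf 'S02 3\n' > "$tmp/b"
count_per_file "$tmp/a" "$tmp/b" "$tmp/a"
rm -r "$tmp"
```

Passing the same file twice still counts as two files, which is why listing file1 a second time produces a third column.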

Last edited by yifangt; 03-24-2015 at 07:33 PM..
# 17  
Old 03-24-2015
It looks like you're getting the concepts down. Here are a few comments to consider as you move forward:
  1. Please get in the habit of indenting your code in a manner that makes the structure of your code clear.
  2. Notice that your original output file format had 38 characters/file (which I encoded as %-37s followed by a space, not as %-38s). This guarantees that if the data from one file is wider than normal, a space will still separate it from the output for the next file. If you want to truncate, rather than print all of the data from a wide file, use %-37.37s (the - keeps the field left-justified).
  3. You said you have 44 files. That should not be a problem. With a nominal LINE_MAX value of 2048 and 38 characters/file , you can process up to 53 files and still have output that fits the requirements for a text file. (If you go over LINE_MAX bytes per output line, tools like awk, grep, sed, vi, etc. are not guaranteed to be able to process your output as input for further processing.)
  4. I have a personal pet peeve about writing unneeded spaces at the end of an output line. It makes the size of your output larger, but doesn't make any other difference unless you have some program that requires fixed length records that will be reading your output.
  5. If you're going to have varying numbers of file operands, put the pathnames for those files on your command line as operands and reference the entire list in your script using "$@".
  6. If you use a variable (like n in your script), consider adding it as a command line operand or option.
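As a quick way to see the padding and truncation behaviour from comment 2, here is a small sketch using a width of 10 instead of 37 so the effect is easy to see. `show_field` is a hypothetical helper; the brackets only mark where the field starts and ends.

```shell
# Hypothetical helper: show how awk pads or truncates a string.
# $1 is the printf format, $2 the string; brackets mark the field edges.
show_field() {
    awk -v fmt="$1" -v s="$2" 'BEGIN { printf("[" fmt "]\n", s) }'
}

show_field '%-10s'    'S01 1 2'          # short data: padded on the right to 10 characters
show_field '%-10s'    'S01 1234 5678'    # wide data: printed in full, the column overflows
show_field '%-10.10s' 'S01 1234 5678'    # wide data: cut off after 10 characters
```

A minimum width alone never truncates; only the precision (the number after the dot) does.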
If you find these comments useful, consider something like:
Code:
#!/bin/ksh
IAm=${0##*/}
Usage="Usage: $IAm: rows file..."
if [ $# -lt 2 ]
then	printf '%s\n' "$Usage" >&2
	exit 1
fi
rows="$1"
if [ "$rows" = '' ] || [ "${rows#*[![:digit:]]}" != "$rows" ]
then	printf '%s: rows must be numeric\n%s\n' "$IAm" "$Usage" >&2
	exit 2
fi
shift
awk -v rows="$rows" '
FNR == 1 {
	f++
}
{	d[f, $1] = $0
}
END {	for(i = 1; i <= rows; i++) {
		id = sprintf("S%02d", i)
		for(j = 1; j <= f; j++)
			printf(j < f ? "%-37s" : "%s\n",
				(j, id) in d ? d[j, id] : id "    0    0    0")
	}
}' "$@"
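The LINE_MAX arithmetic from comment 3 can be checked with a few lines of shell. This is only a sketch: `max_files` is a hypothetical helper, and it assumes every file takes a fixed number of bytes per output line and that the limit includes the trailing newline.

```shell
# Hypothetical helper: how many fixed-width columns fit on one output line.
# $1 is the line limit in bytes (including the newline), $2 the width per file.
max_files() {
    echo $(( ($1 - 1) / $2 ))
}

max_files 2048 38                       # prints 53
max_files "$(getconf LINE_MAX)" 38      # same check against this system's limit
```

With 44 files at 38 bytes each the output lines are 1672 bytes plus the newline, which stays under a 2048-byte limit.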

# 18  
Old 03-24-2015
Thanks a lot!
Those comments are definitely very professional, and I am trying to be too. The alignment with spaces was not my original concern. The part that still bugs me is the block
Code:
 {     
  i=$1     
  C[i]     
  $1="" 
}

which I have started a new thread for. Thank you very much for all your help!
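For readers with the same question, a minimal sketch of what `$1=""` does. It assumes the surrounding script prints the saved key `i` followed by the rest of the record, which is a guess based on the S01S01 output shown earlier, not the actual script. The function names `strip_key` and `keep_key` are made up for the demonstration.

```shell
# Assigning to any field makes awk rebuild $0 from its fields, joined by OFS
# (a single space by default). After $1 = "", $0 holds the remaining fields
# behind one leading separator, and the original runs of blanks are collapsed.
strip_key() {
    awk '{ i = $1; $1 = ""; print i $0 }'
}

# Without the assignment, $0 still starts with the key, so it appears twice.
keep_key() {
    awk '{ i = $1; print i $0 }'
}

printf 'S01    36407    53706    88540\n' | strip_key   # S01 36407 53706 88540
printf 'S01    36407    53706    88540\n' | keep_key    # S01S01    36407    53706    88540
```

By contrast, `A[i] = $0` only copies the record into the array; it does not modify any field, so $0 is left untouched.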