how to get the columns name and fields from this file.


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting how to get the columns name and fields from this file.
# 15  
Old 06-12-2007
The problem is more complicated.
I'm not sure that your output example is valid, the lines 1 and 2 seem imcompletes (missing datas):
Code:
TMID|ORIG_NUM|QRM_RUN_TYP,1530|C0050140063|C,RUN_DATE,17203,17318 
TMID|ORIG_NUM|QRM_RUN_TYP,1530|C0050202418|C,RUN_DATE,17204,. 
TMID|ORIG_NUM|QRM_RUN_TYP,1530|C0050202418|C,COMM_SRC_TXT,Purcha.COR_MA.30 DAY,NONE 
Key_NUM,L0064723026,Key_NUM,FNL_PIPLN_EXIT_DT,17303,.

The difficulty is to locate colums in the input files.
For that i use the separating line under the columns headers :
Code:
key_NUM     || FNL_PIPLN FNL_PIPLN     Diff.    % Diff
            || _EXIT_DT  _EXIT_DT
___________ || _________ _________ _________ _________

The code:
Code:
$ cat extract.awk
#!/usr/bin/awk -f
# Awk File: extract2.awk

function trim(str) {
   gsub(/^[[:space:]]*|[[:space:]]*$/, "", str);
   return str;
}

function compress(str) {
   gsub(/[[:space:]]+/, "", str);
   return trim(str);
}

BEGIN {
   FS  = "|";
   OFS = ",";
}

{
   if (NF != 3 || $2 != "" || $1 $3 ~ /^[[:space:]]*$/) next;
}

#-------------------------------------------------------------------
# Get columns names
#-------------------------------------------------------------------

function initColsNames() {
   cols_left["count"]          = 0;
   cols_left["headers_count"]  = 0;
   cols_list_left              = "";

   cols_right["count"]         = 0
   cols_right["headers_count"] = 0;
   cols_list_right             = "";
   return;
}


function memoColsHeaders(headers, cols) {
   if (headers ~ /^[[:space:]]*$/) return;
   cols[++cols["headers_count"], "header"] = headers;
}

function getColsInfos(headers, cols,      count, used, p) {
   count = 0;
   used  = 0;
   while (p = match(headers, /[^[:space:]]+/)) {
      ++count;
      cols[count, "start"  ] = RSTART + used;
      cols[count, "length" ] = RLENGTH
      used += RSTART - 1 + RLENGTH
      headers = substr(headers, RSTART+RLENGTH);
   }
   cols["count"] = count;
}

function getColsNames(cols,    c, h, n) {
   for (c=1; c<=cols["count"]; c++) {
      n = "";
      for (h=1; h<=cols["headers_count"]; h++)
         n = n substr(cols[h, "header"], cols[c, "start"], cols[c, "length"]);
      gsub(/[[:space:]]/, "", n);
      cols[c, "name"] = n;
   }
}

function buildColsLists(list,     c, h) {
   for (c=1; c<=cols_left["count"]; c++)
      cols_list_left = cols_list_left (c==1 ? "" : "|") cols_left[c, "name"];
   cols_list_right = cols_right[1, "name"];
}

$3 ~ /Base[[:space:]]+(Compare|Value)/ {
   initColsNames();
   while (getline) {
      if ($3 ~ /^[[:space:]]*$/) continue;
      if ($3 ~ /^[_[:space:]]+$/) break;
      memoColsHeaders($1, cols_left);
      memoColsHeaders($3, cols_right);
   }
   getColsInfos($1, cols_left);
   getColsInfos($3, cols_right);
   getColsNames(cols_left);
   getColsNames(cols_right);
   buildColsLists();
   next;
}

#-------------------------------------------------------------------
# Get values
#-------------------------------------------------------------------

function getValuesLeft(values,    c, v) {
   for (c=1; c<=cols_left["count"]; c++) {
      v = substr(values, cols_left[c, "start"], cols_left[c, "length"]);
      cols_values_left = (c==1 ? "" : cols_values_left "|") trim(v);
   }
}

function getValuesRight(values) {
   for (c=1; c<=2; c++) {
      v = substr(values, cols_right[c, "start"], cols_right[c, "length"]);
      cols_values_right = (c==1 ? "" : cols_values_right OFS) trim(v);
   }
}

cols_list_left && cols_list_right  {
   getValuesLeft($1);
   getValuesRight($3);
   print cols_list_left,cols_values_left,cols_list_right,cols_values_right;
}

Input file :
Code:
$ cat base_compare.txt
                                      ||       Base    Compare
TM_ID       ORIG_NUM     QRM_RUN_TYP  ||   RUN_DATE   RUN_DATE      Diff.     % Diff
__________  ___________  ___________  ||  _________  _________  _________  _________
                                      ||
1530        C0050140063  C            ||      17203      17318   115.0000     0.6685
1530        C0050202418  C            ||      17204          .          .          .
                                      ||  Base Value           Compare Value
TM_ID       ORIG_NUM     QRM_RUN_TYP  ||  COMM_SRC_TXT          COMM_SRC_TXT
__________  ___________  ___________  ||  ____________________  ____________________
                                      ||
1530        C0050140063  C            ||  Purcha.COR_MA.30 DAY  NONE

            || Base Compare
key_NUM     || FNL_PIPLN FNL_PIPLN     Diff.    % Diff
            || _EXIT_DT  _EXIT_DT
___________ || _________ _________ _________ _________
            ||
L0051041044 ||     17303         .         .         .
L0053125829 ||     17303         .         .         .
L0053855466 ||     17303         .         .         .
L0061072864 ||     17303         .         .         .
L0061288205 ||     17303     17103         .         .
L0062343827 ||     17303     17203         .         .
L0062526348 ||     17303     23201         .         .
L0063725295 ||     17303         .         .         .
L0064004237 ||     17303         .         .         .
L0064723026 ||     17303         .         .         .
            || Base Compare
key_NUM     || INIT_WHSE INIT_WHSE     Diff.    % Diff
            || _AMT _AMT
___________ || _________ _________ _________ _________
            ||
L0051041044 ||  100.2340        12         .         .
L0053125829 ||   99.9700         .         .         .
L0053855466 ||   98.2790         .         .         .
L0061072864 ||   99.9990         .         .         .
L0061288205 ||   99.1050       134         .         .
L0062343827 ||   98.5080         .         .         .
L0062526348 ||  100.2340         .         .         .
L0063725295 ||   98.9560         .         .         .
L0064004237 ||   98.9560         .         .         .
L0064723026 ||   99.2930       213         .         .
            || Base Compare
key_NUM     || FNL_PIPLN FNL_PIPLN     Diff.    % Diff
            || _AMT _AMT
___________ || _________ _________ _________ _________
            ||
L0051041044 ||  100.2340         .         .         .
L0053125829 ||   99.9700         .         .         .
L0053855466 ||   98.2790         .         .         .
            || Base Value Compare Value
key_Num     || FNL_PIPLN_EXIT_RSN_C FNL_PIPLN_EXIT_RSN_C
            || D                    D
___________ || ____________________ ____________________
            ||
L0051041044 || C
L0053125829 || C
L0053855466 || C
L0061072864 || C
L0061288205 || C
L0062343827 || C
L0062526348 || C
L0063725295 || C
L0064004237 || C
L0064723026 || C
$

Output:
Code:
$ awk -f extract.awk base_compare.txt
TM_ID|ORIG_NUM|QRM_RUN_TYP,1530|C0050140063|C,RUN_DATE,17203,17318
TM_ID|ORIG_NUM|QRM_RUN_TYP,1530|C0050202418|C,RUN_DATE,17204,.
TM_ID|ORIG_NUM|QRM_RUN_TYP,1530|C0050140063|C,COMM_SRC_TXT,Purcha.COR_MA.30 DAY,NONE
key_NUM,L0051041044,FNL_PIPLN_EXIT_DT,17303,.
key_NUM,L0053125829,FNL_PIPLN_EXIT_DT,17303,.
key_NUM,L0053855466,FNL_PIPLN_EXIT_DT,17303,.
key_NUM,L0061072864,FNL_PIPLN_EXIT_DT,17303,.
key_NUM,L0061288205,FNL_PIPLN_EXIT_DT,17303,17103
key_NUM,L0062343827,FNL_PIPLN_EXIT_DT,17303,17203
key_NUM,L0062526348,FNL_PIPLN_EXIT_DT,17303,23201
key_NUM,L0063725295,FNL_PIPLN_EXIT_DT,17303,.
key_NUM,L0064004237,FNL_PIPLN_EXIT_DT,17303,.
key_NUM,L0064723026,FNL_PIPLN_EXIT_DT,17303,.
key_NUM,L0051041044,INIT_WHSE_AMT_AMT,100.2340,12
key_NUM,L0053125829,INIT_WHSE_AMT_AMT,99.9700,.
key_NUM,L0053855466,INIT_WHSE_AMT_AMT,98.2790,.
key_NUM,L0061072864,INIT_WHSE_AMT_AMT,99.9990,.
key_NUM,L0061288205,INIT_WHSE_AMT_AMT,99.1050,134
key_NUM,L0062343827,INIT_WHSE_AMT_AMT,98.5080,.
key_NUM,L0062526348,INIT_WHSE_AMT_AMT,100.2340,.
key_NUM,L0063725295,INIT_WHSE_AMT_AMT,98.9560,.
key_NUM,L0064004237,INIT_WHSE_AMT_AMT,98.9560,.
key_NUM,L0064723026,INIT_WHSE_AMT_AMT,99.2930,213
key_NUM,L0051041044,FNL_PIPLN_AMT_AMT,100.2340,.
key_NUM,L0053125829,FNL_PIPLN_AMT_AMT,99.9700,.
key_NUM,L0053855466,FNL_PIPLN_AMT_AMT,98.2790,.
key_Num,L0051041044,FNL_PIPLN_EXIT_RSN_CD,C,
key_Num,L0053125829,FNL_PIPLN_EXIT_RSN_CD,C,
key_Num,L0053855466,FNL_PIPLN_EXIT_RSN_CD,C,
key_Num,L0061072864,FNL_PIPLN_EXIT_RSN_CD,C,
key_Num,L0061288205,FNL_PIPLN_EXIT_RSN_CD,C,
key_Num,L0062343827,FNL_PIPLN_EXIT_RSN_CD,C,
key_Num,L0062526348,FNL_PIPLN_EXIT_RSN_CD,C,
key_Num,L0063725295,FNL_PIPLN_EXIT_RSN_CD,C,
key_Num,L0064004237,FNL_PIPLN_EXIT_RSN_CD,C,
key_Num,L0064723026,FNL_PIPLN_EXIT_RSN_CD,C,
$

Jean-Pierre.
# 16  
Old 06-13-2007
works Fine but problem with single '_' under score.

Thank you very much for your time.

Program works like champ but little problem is if the column has one character values "C" or something above that it has "ONE underscore" _ it is printing the first letter of the column only it is not pring the hole column name.

i have replace one under score or line with multiple it working fine.

output i'm getting

key_Num,L0051041044,FD,C,
key_Num,L0053125829,FD,C


input file:
|| Base Value Compare Value
key_Num || FNL_PIPLN_EXIT_RSN_C FNL_PIPLN_EXIT_RSN_C
|| D D
___________ || _ _
||
L0051041044 || C
L0053125829 || C
L0053855466 || C

Last edited by rnallamothu; 06-13-2007 at 08:02 PM..
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to concatenate 2-columns by 2 -columns for a text file?

Hello, I want to concatenate 2-columns by 2-columns separated by colon. How can I do so? For example, I have a text file containing 6 columns separated by tab. I want to concatenate column 1 and 2; column 3 and 4; column 5 and 6, respectively, and put a colon in between. input file: 1 0 0 1... (10 Replies)
Discussion started by: huiyee1
10 Replies

2. UNIX for Advanced & Expert Users

Sort in fields date and columns.

Hi colleagues, I have this output, i need sort for 7th and 8th columns, the column 7th is date mm/dd/yyyy format and column 8th is time hh:mm:ss.number PRUEBA 1263 0007 1 0 7131292 03/25/2013 16:43:04.159976 PROCESS PRUEBA1 666 0146 1 0 11600064 ... (1 Reply)
Discussion started by: systemoper
1 Replies

3. Shell Programming and Scripting

Deleting all the fields(columns) from a .csv file if all rows in that columns are blanks

Hi Friends, I have come across some files where some of the columns don not have data. Key, Data1,Data2,Data3,Data4,Data5 A,5,6,,10,, A,3,4,,3,, B,1,,4,5,, B,2,,3,4,, If we see the above data on Data5 column do not have any row got filled. So remove only that column(Here Data5) and... (4 Replies)
Discussion started by: ks_reddy
4 Replies

4. Shell Programming and Scripting

How to parse fixed-width columns which may include empty fields?

I am trying to selectively display several columns from a db2 query, which gives me a fixed-width output (partial output listed here): --------- -------------------------- ------------ ------ 000 0000000000198012 702 29 000 0000000000198013 ... (9 Replies)
Discussion started by: ahsh79
9 Replies

5. UNIX for Advanced & Expert Users

Help in Deleting columns and Renaming Mutliple columns in a .Csv File

Hi All, i have a .Csv file in the below format startTime, endTime, delta, gName, rName, rNumber, m2239max, m2239min, m2239avg, m100016509avg, m100019240max, metric3min, m100019240avg, propValues 11-Mar-2012 00:00:00, 11-Mar-2012 00:05:00, 300.0, vma3550a, a-1_CPU Index<1>, 200237463, 0.0,... (9 Replies)
Discussion started by: mahi_mayu069
9 Replies

6. UNIX for Dummies Questions & Answers

Removing columns from a text file that do not have any values in second and third columns

I have a text file that has three columns. But at the end of the text file, there are trailing lines that have missing second and third columns: 4 0.04972604 KLHL28 4 0.0497332 CSTB 4 0.04979822 AIF1 4 0.04983331 DECR2 4 0.04990344 KATNB1 4 4 4 4 How can I remove the trailing... (3 Replies)
Discussion started by: evelibertine
3 Replies

7. Shell Programming and Scripting

Compare Fields from two text files using key columns

Hi All, I have two files to compare. Each has 10 columns with first 4 columns being key index together. The rest of the columns have monetary values. Using Perl, I want to read one file into hash; check for the key value availability in file 2; then compare the values in the rest of 6... (2 Replies)
Discussion started by: Sangtha
2 Replies

8. Web Development

split the fields in a column into 3 columns

Have a column "address" which is combination of city, region and postal code like. Format is : city<comma><space>region<space>postal code abc, xyz 123456 All these three city, region and postal code are not mandatory. There can be any one of the above. In that case a nell... (2 Replies)
Discussion started by: rakshit
2 Replies

9. Shell Programming and Scripting

how to convert fields from a text file to excel columns

i have this file which has the following contents: ,-0.3000 ,-0.3000 ,-0.3000 ,-0.9000 ,-0.9000 ,-0.9000 i would like to get this: -0.3-0.9-0.3-0.9-0.3-0.9 so far i am trying: awk '{for(i=1; i<=NF; i++) {printf("%f\n",$i)}}' test1 > test2 any help... (4 Replies)
Discussion started by: npatwardhan
4 Replies

10. Shell Programming and Scripting

Joining two files based on columns/fields

I've got two files, File1 and File2 File 1 has got combination of col1, col2 and col3 which comes on file2 as well, file2 does not get col4. Now based on col1, col2 and col3, I would like to get col4 from file1 and all the columns from file2 in a new file Any ideas? File1 ------ Col1 col2... (11 Replies)
Discussion started by: rudoraj
11 Replies
Login or Register to Ask a Question