Help with separating datatype, column name


# 1  
Old 06-20-2012

Hi All,
I am new to Unix, but I have a requirement to separate the datatype, length, and column name from an input file in the format below:

record
integer(10) empid;
string(25) name;
date("YYYY-MM-DD") dob;
decimal(10) salary;
end

After extracting the datatype, its length, and the column name, I need to put them at specific locations in another file, which has a standard format.

Can you please suggest an approach?

Thanks
# 2  
Old 06-20-2012
Show the output you want, not just the input you have.
# 3  
Old 06-20-2012
Quote:
<SOURCEFIELD BUSINESSNAME ="" DATATYPE ="int" DESCRIPTION ="" FIELDNUMBER ="1" FIELDPROPERTY ="0" FIELDTYPE ="ELEMITEM" HIDDEN ="NO" KEYTYPE ="NOT A KEY" LENGTH ="10" LEVEL ="0" NAME ="empid" NULLABLE ="NULL" OCCURS ="0" OFFSET ="0" PHYSICALLENGTH ="10" PHYSICALOFFSET ="0" PICTURETEXT ="" PRECISION ="10" SCALE ="0" USAGE_FLAGS =""/>
<SOURCEFIELD BUSINESSNAME ="" DATATYPE ="string" DESCRIPTION ="" FIELDNUMBER ="2" FIELDPROPERTY ="0" FIELDTYPE ="ELEMITEM" HIDDEN ="NO" KEYTYPE ="NOT A KEY" LENGTH ="25" LEVEL ="0" NAME ="name" NULLABLE ="NULL" OCCURS ="0" OFFSET ="11" PHYSICALLENGTH ="10" PHYSICALOFFSET ="10" PICTURETEXT ="" PRECISION ="10" SCALE ="0" USAGE_FLAGS =""/>
<SOURCEFIELD BUSINESSNAME ="" DATATYPE ="string" DESCRIPTION ="" FIELDNUMBER ="2" FIELDPROPERTY ="0" FIELDTYPE ="ELEMITEM" HIDDEN ="NO" KEYTYPE ="NOT A KEY" LENGTH ="10" LEVEL ="0" NAME ="dob" NULLABLE ="NULL" OCCURS ="0" OFFSET ="11" PHYSICALLENGTH ="10" PHYSICALOFFSET ="10" PICTURETEXT ="" PRECISION ="10" SCALE ="0" USAGE_FLAGS =""/>
<SOURCEFIELD BUSINESSNAME ="" DATATYPE ="number" DESCRIPTION ="" FIELDNUMBER ="3" FIELDPROPERTY ="0" FIELDTYPE ="ELEMITEM" HIDDEN ="NO" KEYTYPE ="NOT A KEY" LENGTH ="10" LEVEL ="0" NAME ="salary" NULLABLE ="NULL" OCCURS ="0" OFFSET ="21" PHYSICALLENGTH ="10" PHYSICALOFFSET ="20" PICTURETEXT ="" PRECISION ="10" SCALE ="0" USAGE_FLAGS =""/>
I need to put the datatype, its length, and the column name into the standard XML format above.
Notes:
1] For a date field, use string in the XML. The length is the number of bytes in the format, e.g. 6 for (YYMMDD) and 10 for (YYYY-MM-DD).
2] For a decimal field, use number in the XML. Everything else stays the same.

The number of fields will vary; here we have 4. Please guide me.
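The two notes above, plus the integer to int change visible in the sample XML, could be sketched in awk roughly as follows. `map_fields` is a hypothetical helper name, and the sketch assumes one field per line and a date format that contains no spaces.

```shell
# Hypothetical helper: reads "type(arg) name;" lines on stdin and prints
# "xml_type length name" for each field line.
map_fields() {
    awk '
    {
        gsub(/[();]/, " ")      # "integer(10) empid;" -> "integer 10 empid"
        gsub(/"/, "")           # drop the quotes around a date format
    }
    NF == 3 {                   # "record" and "end" have one field and are skipped
        type = $1; len = $2
        if (type == "integer") type = "int"       # inferred from the sample XML
        if (type == "decimal") type = "number"    # note 2
        if (type == "date") {                     # note 1
            type = "string"
            len = length($2)    # "YYYY-MM-DD" -> 10, "YYMMDD" -> 6
        }
        print type, len, $3
    }'
}
```

Called as `map_fields < data`, this prints one `type length name` line per field, which can then be dropped into the XML template.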
# 4  
Old 06-20-2012
Where do things like 'offset' come from?
# 5  
Old 06-20-2012
Code:
$ cat r2xml.awk
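BEGIN { F=0     }       # F is the running field counter; numbering starts at 0

{       gsub(/[();]/, " ");     # turn "type(len) name;" into "type len name"
        gsub(/['"]/, "");       # drop the quotes around a date format
        $1=$1                   # rebuild $0 with single spaces and no leading blanks
        gsub(/^integer/, "int");
        gsub(/^date/, "string");
        gsub(/^decimal/, "number");

        # A non-numeric second field is a date format; use its length instead
        if(!($2 ~ /^[0-9]*$/)) $2=length($2);
}

# Only field lines have three fields; "record" and "end" are skipped
NF==3 {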
        printf("<SOURCEFIELD BUSINESSNAME=\"\" DATATYPE=\"%s\" "\
                "DESCRIPTION=\"\" FIELDNUMBER=\"%d\" "          \
                "FIELDPROPERTY=\"0\" FIELDTYPE=\"ELEMITEM\" "   \
                "HIDDEN=\"NO\" KEYTYPE=\"NOT A KEY\" "          \
                "LENGTH=\"%d\" LEVEL=\"0\" NAME=\"%s\" "        \
                "NULLABLE=\"NULL\" OCCURS=\"0\" OFFSET=\"0\" "  \
                "PHYSICALLENGTH=\"10\" PHYSICALOFFSET=\"0\" "   \
                "PICTURETEXT=\"\" PRECISION=\"10\" SCALE=\"0\" "\
                "USAGE_FLAGS=\"\"/>\n",
                $1, F++, $2, $3);
}

$ awk -f r2xml.awk data
<SOURCEFIELD BUSINESSNAME="" DATATYPE="int" DESCRIPTION="" FIELDNUMBER="0" FIELDPROPERTY="0" FIELDTYPE="ELEMITEM" HIDDEN="NO" KEYTYPE="NOT A KEY" LENGTH="10" LEVEL="0" NAME="empid" NULLABLE="NULL" OCCURS="0" OFFSET="0" PHYSICALLENGTH="10" PHYSICALOFFSET="0" PICTURETEXT="" PRECISION="10" SCALE="0" USAGE_FLAGS=""/>
<SOURCEFIELD BUSINESSNAME="" DATATYPE="string" DESCRIPTION="" FIELDNUMBER="1" FIELDPROPERTY="0" FIELDTYPE="ELEMITEM" HIDDEN="NO" KEYTYPE="NOT A KEY" LENGTH="25" LEVEL="0" NAME="name" NULLABLE="NULL" OCCURS="0" OFFSET="0" PHYSICALLENGTH="10" PHYSICALOFFSET="0" PICTURETEXT="" PRECISION="10" SCALE="0" USAGE_FLAGS=""/>
<SOURCEFIELD BUSINESSNAME="" DATATYPE="string" DESCRIPTION="" FIELDNUMBER="2" FIELDPROPERTY="0" FIELDTYPE="ELEMITEM" HIDDEN="NO" KEYTYPE="NOT A KEY" LENGTH="10" LEVEL="0" NAME="dob" NULLABLE="NULL" OCCURS="0" OFFSET="0" PHYSICALLENGTH="10" PHYSICALOFFSET="0" PICTURETEXT="" PRECISION="10" SCALE="0" USAGE_FLAGS=""/>
<SOURCEFIELD BUSINESSNAME="" DATATYPE="number" DESCRIPTION="" FIELDNUMBER="3" FIELDPROPERTY="0" FIELDTYPE="ELEMITEM" HIDDEN="NO" KEYTYPE="NOT A KEY" LENGTH="10" LEVEL="0" NAME="salary" NULLABLE="NULL" OCCURS="0" OFFSET="0" PHYSICALLENGTH="10" PHYSICALOFFSET="0" PICTURETEXT="" PRECISION="10" SCALE="0" USAGE_FLAGS=""/>

$
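The script above numbers fields from 0 and hard-codes OFFSET as 0, and the thread never settles where the offsets should come from. If FIELDNUMBER is meant to start at 1 and OFFSET is meant to be the sum of the preceding field lengths (both are assumptions, not something the thread confirms), a variant might look like this. `fields_with_offsets` is a hypothetical name, and only the attributes that vary per field are printed to keep the sketch short.

```shell
# Sketch only: 1-based numbering and a cumulative OFFSET are assumptions.
fields_with_offsets() {
    awk '
    BEGIN {
        n = 0; offset = 0
        # Reduced template: only the per-field attributes are emitted here
        fmt = "<SOURCEFIELD DATATYPE=\"%s\" FIELDNUMBER=\"%d\" "
        fmt = fmt "LENGTH=\"%d\" NAME=\"%s\" OFFSET=\"%d\"/>\n"
    }
    {
        gsub(/[();]/, " ")      # "type(len) name;" -> "type len name"
        gsub(/"/, "")           # drop the quotes around a date format
    }
    NF == 3 {
        type = $1; len = $2
        if (type == "integer") type = "int"
        if (type == "decimal") type = "number"
        if (type == "date")  { type = "string"; len = length($2) }
        printf(fmt, type, ++n, len, $3, offset)
        offset += len           # next field starts where this one ends
    }'
}
```

Run as `fields_with_offsets < data`. For the sample record this gives offsets 0, 10, 35 and 45, which does not match the offsets in the requested XML, so the real rule for OFFSET still needs to come from the original poster.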
