Full Discussion: Field delimited data to XML
Post 302878995 by Indalecio on Tuesday 10th of December 2013 04:17:57 AM
OK, I'll drop an example to illustrate the sort of data we would be dealing with.
Code:
Each occurrence of Record_key_1 identifies a new object; all subsequent records refer to that object until a new object is defined.

Sample input:
Record_key_1 | A | B | C |
Record_key_2 | 0 | 1 |
Record_key_2 | 2 | 3 |
Record_key_2 | 4 | 5 |
Record_key_3 | a | b | c | d |
Record_key_3 | e | f | g | h |
Record_key_1 | D | E | F |
Record_key_2 | 6 | 7 |
Record_key_2 | 8 | 9 |
Record_key_3 | i | j | k | l |

= 2 objects total

Sample output:
<Object>
   <Object_Properties>
      <Properties_1>A</Properties_1>
      <Properties_2>B</Properties_2>
      <Properties_3>C</Properties_3>
   </Object_Properties>
   <Sub-object>
      <Sub-object_characteristics_1>0</Sub-object_characteristics_1>
      <Sub-object_characteristics_2>1</Sub-object_characteristics_2>
   </Sub-object>
   <Sub-object>
      <Sub-object_characteristics_1>2</Sub-object_characteristics_1>
      <Sub-object_characteristics_2>3</Sub-object_characteristics_2>
   </Sub-object>
   <Sub-object>
      <Sub-object_characteristics_1>4</Sub-object_characteristics_1>
      <Sub-object_characteristics_2>5</Sub-object_characteristics_2>
   </Sub-object>
   <Object_user>
      <User_details_1>a</User_details_1>
      <User_details_2>b</User_details_2>
      <User_details_3>c</User_details_3>
      <User_details_4>d</User_details_4>
   </Object_user>
   <Object_user>
      <User_details_1>e</User_details_1>
      <User_details_2>f</User_details_2>
      <User_details_3>g</User_details_3>
      <User_details_4>h</User_details_4>
   </Object_user>
</Object>

<Object>
   <Object_Properties>
      <Properties_1>D</Properties_1>
      <Properties_2>E</Properties_2>
      <Properties_3>F</Properties_3>
   </Object_Properties>
   <Sub-object>
      <Sub-object_characteristics_1>6</Sub-object_characteristics_1>
      <Sub-object_characteristics_2>7</Sub-object_characteristics_2>
   </Sub-object>
   <Sub-object>
      <Sub-object_characteristics_1>8</Sub-object_characteristics_1>
      <Sub-object_characteristics_2>9</Sub-object_characteristics_2>
   </Sub-object>
   <Object_user>
      <User_details_1>i</User_details_1>
      <User_details_2>j</User_details_2>
      <User_details_3>k</User_details_3>
      <User_details_4>l</User_details_4>
   </Object_user>
</Object>
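
The grouping rule in the sample (a Record_key_1 line opens a new object, and every following line belongs to it) could be sketched in Python roughly as follows. This is only an illustration under assumptions: the names `parse_line` and `group_objects` are made up for the example, and the input is assumed to use "|" as the delimiter with a trailing "|" on every line, as in the sample.

```python
from typing import Iterable, Iterator, List

SAMPLE = """\
Record_key_1 | A | B | C |
Record_key_2 | 0 | 1 |
Record_key_2 | 2 | 3 |
Record_key_2 | 4| 5 |
Record_key_3 | a | b | c | d |
Record_key_3 | e | f | g | h |
Record_key_1 | D | E | F |
Record_key_2 | 6 | 7 |
Record_key_2 | 8 | 9 |
Record_key_3 | i | j | k | l |
"""


def parse_line(line: str) -> List[str]:
    """Split one pipe-delimited record into stripped fields."""
    fields = [f.strip() for f in line.split("|")]
    if fields and fields[-1] == "":
        fields.pop()  # drop the empty field left by the trailing '|'
    return fields


def group_objects(
    lines: Iterable[str], start_key: str = "Record_key_1"
) -> Iterator[List[List[str]]]:
    """Yield one list of records per object; start_key opens a new object."""
    current: List[List[str]] = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = parse_line(line)
        if fields[0] == start_key and current:
            yield current  # the previous object is complete
            current = []
        current.append(fields)
    if current:
        yield current  # last object in the input


objects = list(group_objects(SAMPLE.splitlines()))
print(len(objects))  # 2, matching "= 2 objects total"
```

Because `group_objects` is a generator, only one object is held in memory at a time, which may matter once the input grows to a large number of objects.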

I'm wondering where to store the mapping configuration so that the program can pick it up, rather than relying on a large number of conditional statements based on the record key value and the field_ID to deduce which tags to use.
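
One possible answer, sketched in Python: keep the mapping as data (here JSON) and look the tags up by record key, so no per-key conditionals are needed. Everything named here is an assumption for illustration: the JSON layout, the `container`/`field` keys and the `record_to_xml` function are invented, and the inline string stands in for an external configuration file. The tag names use underscores because XML element names cannot contain spaces.

```python
import json
from typing import Dict, List
from xml.sax.saxutils import escape

# Stand-in for the content of an external mapping file (hypothetical layout).
MAPPING_JSON = """
{
  "Record_key_1": {"container": "Object_Properties", "field": "Properties"},
  "Record_key_2": {"container": "Sub-object", "field": "Sub-object_characteristics"},
  "Record_key_3": {"container": "Object_user", "field": "User_details"}
}
"""
MAPPING: Dict[str, Dict[str, str]] = json.loads(MAPPING_JSON)


def record_to_xml(
    fields: List[str],
    mapping: Dict[str, Dict[str, str]] = MAPPING,
    indent: str = "   ",
) -> str:
    """Render one parsed record (key followed by values) as an XML fragment."""
    key, values = fields[0], fields[1:]
    rule = mapping[key]  # a KeyError means the record key is missing from the config
    lines = [f"{indent}<{rule['container']}>"]
    for position, value in enumerate(values, start=1):
        tag = f"{rule['field']}_{position}"
        # escape() protects '&', '<' and '>' inside field values
        lines.append(f"{indent * 2}<{tag}>{escape(value)}</{tag}>")
    lines.append(f"{indent}</{rule['container']}>")
    return "\n".join(lines)


print(record_to_xml(["Record_key_2", "0", "1"]))
```

Adding a new record type then only means adding an entry to the mapping; the code itself stays unchanged.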

My initial approach was to break down the original input structure into a file with one line per field, comprising the Record/Field ID and the value, then replace each Record/Field ID with its corresponding XML tag, and finally wrap it all up together. At this point awk should be able to deal with it. The question is how long it takes to process hundreds of thousands of "objects".
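
For comparison, the flatten-then-replace idea might look like this in Python (not awk). It is a sketch under assumptions: the "record key.position" form of the field ID, the `FIELD_TAGS` table and the function names are all invented for the example, and the timing helper runs on synthetic data purely so the question of speed can be measured on your own machine.

```python
import time
from typing import Dict, Iterable, Iterator, Tuple
from xml.sax.saxutils import escape

# Hypothetical lookup from "record key.field position" to an XML tag name.
FIELD_TAGS: Dict[str, str] = {
    "Record_key_1.1": "Properties_1",
    "Record_key_1.2": "Properties_2",
    "Record_key_1.3": "Properties_3",
    "Record_key_2.1": "Sub-object_characteristics_1",
    "Record_key_2.2": "Sub-object_characteristics_2",
    "Record_key_3.1": "User_details_1",
    "Record_key_3.2": "User_details_2",
    "Record_key_3.3": "User_details_3",
    "Record_key_3.4": "User_details_4",
}


def flatten(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield one (field_id, value) pair per field, e.g. ("Record_key_2.1", "0")."""
    for line in lines:
        parts = [p.strip() for p in line.split("|")]
        key, values = parts[0], parts[1:]
        if values and values[-1] == "":
            values.pop()  # drop the empty field left by the trailing '|'
        for position, value in enumerate(values, start=1):
            yield f"{key}.{position}", value


def tag_fields(pairs: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Replace each field ID with its tag and wrap the escaped value."""
    for field_id, value in pairs:
        tag = FIELD_TAGS[field_id]
        yield f"<{tag}>{escape(value)}</{tag}>"


def time_flatten(n_objects: int) -> Tuple[int, float]:
    """Return (fields processed, seconds elapsed) for n synthetic objects."""
    one_object = [
        "Record_key_1 | A | B | C |",
        "Record_key_2 | 0 | 1 |",
        "Record_key_3 | a | b | c | d |",
    ]
    start = time.perf_counter()
    count = sum(1 for _ in tag_fields(flatten(one_object * n_objects)))
    return count, time.perf_counter() - start


tagged = list(tag_fields(flatten(["Record_key_1 | A | B | C |", "Record_key_2 | 0 | 1 |"])))
print("\n".join(tagged))
count, seconds = time_flatten(10_000)
print(f"{count} fields in {seconds:.3f} s")
```

One caveat with the flat form: it loses the record boundaries, so the wrapping step has to reintroduce the container tags, for example by treating a field at position 1 as the start of a new record.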

As for the comment that picking a language comes down to personal preference: I can't disagree, but when you look at performance, some languages will handle this requirement faster than others. If that means spending 0.5x more coding time to get there, it's a price I can realistically pay for the benefits it gives us.
