Create a XML file for each row from the csv file


 
Thread Tools Search this Thread
Top Forums UNIX for Beginners Questions & Answers Create a XML file for each row from the csv file
# 1  
Old 04-05-2019
Create a XML file for each row from the csv file

I have a CSV file that looks like this:

Code:
File,Name,birthdate,Amount
File1.xml,Name1,01.02.19,1000
File2.xml,Name2	01.02.20,1000
File3.xml,Name3,01.02.21,1000

I need it to turn it into an XML file for each row, My ultimate goal is for the File1.xml look like this:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
  <properties>
    <entry key="cm:name">Name1</entry>
    <entry key="cm:birthdate">1899-12-30</entry>
    <entry key="cm:amount">$1,000.00</entry>	
  </properties>

Not this:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
	<name>Name1</name>
	<birthdate>1899-12-30</birthdate>
	<amount>$1,000.00</amount>
</properties>

This will use for Content Bulk import and as metadata file in alfresco:
Preparing the Source Content . pmonks/alfresco-bulk-import Wiki . GitHub

Last edited by lxdorney; 04-06-2019 at 11:47 AM..
# 2  
Old 04-05-2019
something to start with: awk -F, -f lx.awk myFile.csv
where lx.awk is:
Code:
BEGIN {
  qq="\""
}
FNR==1{
  for(i=1;i<=NF;i++)
    tags[i]=tolower($i)
  print "<?xml version=" qq "1.0" qq "encoding=" qq "UTF-8" qq "?>\n<!DOCTYPE properties SYSTEM " qq "http://java.sun.com/dtd/properties.dtd" qq ">"
}
{
  print "\t<properties>"
  for(i=2;i<=NF;i++)
    printf("\t\t<entry key=%scm=%s%s>%s</entry\n", qq, tags[i], qq, $i)
  print "\t</properties>"
}

This User Gave Thanks to vgersh99 For This Post:
# 3  
Old 04-05-2019
@vgersh99 thank you so much for the response and your time spent to figure out my problem, the output is looking good there have little modifications and need to be done, BTW this will use as a properties of a file or what we could metadata that will bulk import to alfresco using alfresco-bulk-import.
Code:
BEGIN {
  qq="\""
}
FNR==1{
  for(i=1;i<=NF;i++)
    tags[i]=tolower($i)
  print "<?xml version=" qq "1.0" qq "encoding=" qq "UTF-8" qq "?>\n<!DOCTYPE properties SYSTEM " qq "http://java.sun.com/dtd/properties.dtd" qq ">"
}
{
  print "\t<properties>"
  for(i=2;i<=NF;i++)
    printf("\t\t<entry key=%scm:%s%s>%s</entry>\n", qq, tags[i], qq, $i)
  print "\t</properties>"
}

Here's the ouput:
Code:
<?xml version="1.0"encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
        <properties>
                <entry key="cm:name">Name</entry>
                <entry key="cm:birthdate">Birthdate</entry>
                <entry key="cm:amount">Amount</entry>
        </properties>
        <properties>
                <entry key="cm:name">Name1</entry>
                <entry key="cm:birthdate">01.02.19</entry>
                <entry key="cm:amount">1000</entry>
        </properties>
        <properties>
                <entry key="cm:name">Name2</entry>
                <entry key="cm:birthdate">01.02.20</entry>
                <entry key="cm:amount">1000</entry>
        </properties>
        <properties>
                <entry key="cm:name">Name3</entry>
                <entry key="cm:birthdate">01.02.21</entry>
                <entry key="cm:amount">1000</entry>
        </properties>

My expected output each row from CSV file which the first column with red color is the filename this will look like this:
The first column is the actual filename
Code:
File,Name,Birthdate,Amount
File1.xml,Name1,01.02.19,1000
File2.xml,Name2,01.02.20,1000
File3.xml,Name3,01.02.21,1000

File1.xml -- File2.xml and so on . . .
Code:
<?xml version="1.0"encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
        <properties>
                <entry key="cm:name">Name1</entry>
                <entry key="cm:birthdate">01.02.19</entry>
                <entry key="cm:amount">1000</entry>
        </properties>

And is possible if there is a 5th,6th,7th,8th and 9th column, remove the cm: so the output of each xml file like this:
Code:
<?xml version="1.0"encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
        <properties>
                <entry key="cm:name">Name3</entry>
                <entry key="cm:birthdate">01.02.21</entry>
                <entry key="cm:amount">1000</entry>
                <entry key="separator"></entry>
                <entry key="namespace"></entry>
                <entry key="parentAssociation"></entry>			
                <entry key="type"></entry>
                <entry key="aspects"></entry>					
        </properties>

CSV File:
Code:
File,Name,birthdate,Amount,separator,namespace,parentAssociation,type,aspects
File1.xml,Name1,01.02.19,1000,,,,
File2.xml,Name2,01.02.20,1000,,,,
File3.xml,Name3,01.02.21,1000,,,,


Last edited by lxdorney; 04-06-2019 at 11:54 AM..
# 4  
Old 04-06-2019
Quote:
Originally Posted by lxdorney
And is possible if there is a 5th,6th,7th,8th and 9th column, remove the cm: so the output of each xml file like this:
Code:
<?xml version="1.0"encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
        <properties>
                <entry key="cm:name">Name3</entry>
                <entry key="cm:birthdate">01.02.21</entry>
                <entry key="cm:amount">1000</entry>
                <entry key="separator"></entry>
                <entry key="namespace"></entry>
                <entry key="parentAssociation"></entry>			
                <entry key="type"></entry>
                <entry key="aspects"></entry>					
        </properties>

CSV File:
Code:
File,Name,Birthdate,Amount,separator,namespace,parentAssociation,type,aspects
File1.xml,Name1,01.02.19,1000,,,,
File2.xml,Name2	01.02.20,1000,,,,
File3.xml,Name3,01.02.21,1000,,,,

Hello lxdorney,

Could you please try following, not tested it though.
Code:
awk -v FS="," 'BEGIN{
  qq="\""
}
FNR==1{
  for(i=1;i<=NF;i++)
    tags[i]=tolower($i)
  print "<?xml version=" qq "1.0" qq "encoding=" qq "UTF-8" qq "?>\n<!DOCTYPE properties SYSTEM " qq "http://java.sun.com/dtd/properties.dtd" qq ">"
}
{
  print "\t<properties>"
  for(i=2;i<=4;i++){
    printf("\t\t<entry key=%scm:%s%s>%s</entry>\n", qq, tags[i], qq, $i)
  }
  for(i=5;i<=NF;i++){
    printf("\t\t<entry key=%s%s%s>%s</entry>\n", qq, tags[i], qq, $i)
  }  
  print "\t</properties>"
}'    Input_file

NOTE: You haven't set FS="," which may cause issues since your Input_file is having delimiter as comma so I have added it as -v FS="," too in my code.

Thanks,
R. Singh

Last edited by RavinderSingh13; 04-06-2019 at 03:49 AM..
These 2 Users Gave Thanks to RavinderSingh13 For This Post:
# 5  
Old 04-06-2019
Didn't you "need it to turn it into an XML file for each row"? Try
Code:
awk -v FS="," '
FNR == 1        {split (tolower($0), tags)
                 next
                }
                {print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" > $1
                 print "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">" > $1

                 print "\t<properties>" > $1

                 CM = "cm:"
                 for (i=2; i<=NF; i++)  {printf ("\t\t<entry key=\"%s%s\">%s</entry>\n", CM, tags[i], $i) > $1
                                         if (i == 4) CM = ""
                                        }
                 print "\t</properties>" > $1
                }
'    file

and check the result files. Your first post was not quite consistent with "birthdate" and "birthday". Your sample file in post #3 has a comma separator too few.
These 2 Users Gave Thanks to RudiC For This Post:
# 6  
Old 04-06-2019
Thank so much guys, I really appreciate the time and effort to solved the problem.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Row Count in .csv file

Hi, I have to find the count of rows starting with "E," in given a.csv file . Sample Data File. E,2333AED,A,MC3,25,31-MAY-18 E,2333AED,A,MC3,25,31-MAY-18 CYMC3 25AED 0000 E,2333CZK,A,MC3,25,31-MAY-18 CYMC3 25CZK 0000 E,2333EUR,A,MC3,25,31-MAY-18... (3 Replies)
Discussion started by: Prabhakar Y
3 Replies

2. Shell Programming and Scripting

Need to read a csv and create xml - Perl

I have a csv file like below. john,r2,testdomain1,john.r2@hex.com,DOMAINADMIN,testdomain1.dom maxwell,b2, testdomain1,maxwell.b2@hex.com,DOMAINADMIN,testdomain1.dom I would need the perl script to read the above csv and create an xml like below. <Users> ... (1 Reply)
Discussion started by: Tuxidow
1 Replies

3. UNIX for Advanced & Expert Users

Convert CSV file to nested XML file using UNIX/PERL?

we have a CSV which i need to convert to XML using Perl or Unix shell scripting. I was able to build this XML in oracle database. However, SQL/XML query is running for long time. Hence, I'm considering to write a Perl or shell script to generate this XML file. Basically need to build this XML... (3 Replies)
Discussion started by: laknar
3 Replies

4. Shell Programming and Scripting

Compare 2 files of csv file and match column data and create a new csv file of them

Hi, I am newbie in shell script. I need your help to solve my problem. Firstly, I have 2 files of csv and i want to compare of the contents then the output will be written in a new csv file. File1: SourceFile,DateTimeOriginal /home/intannf/foto/IMG_0713.JPG,2015:02:17 11:14:07... (8 Replies)
Discussion started by: refrain
8 Replies

5. Shell Programming and Scripting

Extract data from XML file and write in CSV file

Hi friend i have input as following XML file <?xml version="1.0"?> <Document xmlns="urn:iso:std:iso:20022:tech:xsd:camt.054.001.02"> <BkToCstmrDbtCdtNtfctn> <GrpHdr><MsgId>LBP-RDJ-TE000000-130042430010001001</MsgId><CreDtTm>2013-01-04T03:21:30</CreDtTm></GrpHdr>... (3 Replies)
Discussion started by: mohan sharma
3 Replies

6. Shell Programming and Scripting

Create xml file using a content from another xml file

I need to create a xml file(master.xml) with contents from another xml files(children). I have below list of xml files in a temporary location (C:/temp/xmls) 1. child1.xml 2. child2.xml Below is the content of the child1.xml & child2.xml files, child1.xml <root> <emp> ... (3 Replies)
Discussion started by: vel4ever
3 Replies

7. Shell Programming and Scripting

Convert XML file to CSV file

Hi Guys, I am new to Shell scripting and need to convert an XML files to a CSV file. My actual problem is that XML file loading is taking hours and I have decided to convert the XML structure to row based data in a CSV file. My XML file: Message846 can repeat within main loop and... (1 Reply)
Discussion started by: qamar.shahbaz
1 Replies

8. Shell Programming and Scripting

Extract data from an XML file & write into a CSV file

Hi All, I am having an XML tag like: <detail sim_ser_no_1="898407109001000090" imsi_1="452070001000090"> <security>ADM1=????</security> <security>PIN1=????</security> <security>PIN2=????</security> ... (2 Replies)
Discussion started by: ss_ss
2 Replies

9. Shell Programming and Scripting

How to Delete Last Row from .csv file in perl

Hi , I've a perl script to convert .xls to .csv .After conversion I want to delete first 28 and the last row from .csv file.Is there any efficent way to achive this both together. I'm deleting first 28 rows by using folllowing my perl code: exec " tail -n+28 infile.csv > outfile.csv ... (7 Replies)
Discussion started by: ajaypatil_am
7 Replies

10. UNIX for Dummies Questions & Answers

Delete first row of csv file

I have a csv file, which is > 2 Gigs. I need to BCP that file to Sybase db , but I cant upload that b'caz first row of the file is failing. ( having some errors probably.) I can manually insert the first line into db & then I can upload the rest of the data in file, if i can delete the first row. ... (2 Replies)
Discussion started by: kedar.mehta
2 Replies
Login or Register to Ask a Question