Converting unstructured data to structured data


 
# 1  
Old 06-05-2016

Hi,

Can someone please help me convert the unstructured data below to CSV format?

Code:
{
  "branchId" : "BNSFGDJNSJG-73264HB-132131BNHJFSDG",
  "branchName" : "NEWYORK-SSDF",
  "branchProductId" : "72Y5HFHSF7H3RUNAWEF",
  "PreferenceId" : "BASDBVcbzcYHcb",
  "emailId" : "hjbasf7hh4aBSDFU@mymail.com",
  "firstName" : "FNM",
  "lastName" : "LNM",
  "middleName" : "",
  "userId" : "LSDFJDSJMDFG238487ybsgfd"
}

Output required:

Code:
branchId,branchName,branchProductId,PreferenceId,emailId,firstName,lastName,middleName,userId
"BNSFGDJNSJG-73264HB-132131BNHJFSDG","NEWYORK-SSDF","72Y5HFHSF7H3RUNAWEF","BASDBVcbzcYHcb", "hjbasf7hh4aBSDFU@mymail.com", "FNM","LNM","","LSDFJDSJMDFG238487ybsgfd"

Thanks in advance
Naveen

Last edited by Don Cragun; 06-06-2016 at 01:29 AM.. Reason: Add CODE tags.
# 2  
Old 06-06-2016
Would this work?
Code:
perl -ne '@c=/(\w+)"\s+:\s+("[^"]+")/ and push @h, $c[0] and push @r, $c[1]; END{$"=","; print "@h\n@r\n"}' naveen.file

Output:
Code:
branchId,branchName,branchProductId,PreferenceId,emailId,firstName,lastName,userId
"BNSFGDJNSJG-73264HB-132131BNHJFSDG","NEWYORK-SSDF","72Y5HFHSF7H3RUNAWEF","BASDBVcbzcYHcb","hjbasf7hh4aBSDFU@mymail.com","FNM","LNM","LSDFJDSJMDFG238487ybsgfd"

# 3  
Old 06-06-2016
Works a treat, thanks Aia. But why is the "middleName" field missing? Is that because it doesn't contain a value?
# 4  
Old 06-06-2016
Code:
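#!/bin/bash

header=(); values=()

# Split each line at the first colon: the key goes to var, the rest to val.
# -r keeps any backslashes in the data intact.
while IFS=":" read -r var val
do
  # Only key/value lines contain a double quote before the colon.
  if [[ "$var" =~ "\"" ]]; then

  # Key: strip the quotes and all whitespace, including the leading indentation.
  var="${var//\"/}"; var="${var//[[:space:]]/}"
   [[ "${#header[@]}" -eq "0" ]] && header=("$var") || header=("${header[@]}","$var")

  # Value: strip the leading space and the trailing comma, keep the quotes.
  val="${val# }"; val="${val%,}"
   [[ "${#values[@]}" -eq "0" ]] && values=("$val") || values=("${values[@]}","$val")

  fi

done <inputfile

printf "%s\n%s\n" "${header[@]}" "${values[@]}"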

# 5  
Old 06-06-2016
Thanks junior-helper. It works.
# 6  
Old 06-06-2016
Quote:
Originally Posted by naveen.kuppili
Works a treat, thanks Aia. But why is the "middleName" field missing? Is that because it doesn't contain a value?
It is because I did not pay enough attention. Apologies.
Please, try the following:
Code:
perl -ne '@c=/(\w+)"\s+:\s+("[^"]*")/ and push @h, $c[0] and push @r, $c[1]; END{$"=","; print "@h\n@r\n"}' naveen.file


Output:

Code:
branchId,branchName,branchProductId,PreferenceId,emailId,firstName,lastName,middleName,userId
"BNSFGDJNSJG-73264HB-132131BNHJFSDG","NEWYORK-SSDF","72Y5HFHSF7H3RUNAWEF","BASDBVcbzcYHcb","hjbasf7hh4aBSDFU@mymail.com","FNM","LNM","","LSDFJDSJMDFG238487ybsgfd"


Last edited by Aia; 06-06-2016 at 01:43 AM..
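
For anyone wondering what changed between the two one-liners: `[^"]+` needs at least one character between the quotes, so the empty `""` of middleName never matched, while `[^"]*` also accepts zero characters. A minimal shell sketch of the difference, assuming `grep -E` is available and using `[[:space:]]` in place of Perl's `\s` (the variable names are only for illustration):

```shell
#!/bin/sh
# Sample line copied from the input in post #1.
line='  "middleName" : "",'

# With "+", the quoted value must contain one or more characters.
if printf '%s\n' "$line" | grep -Eq '"[[:space:]]+:[[:space:]]+"[^"]+"'; then
  plus="match"
else
  plus="no match"
fi

# With "*", the quoted value may be empty.
if printf '%s\n' "$line" | grep -Eq '"[[:space:]]+:[[:space:]]+"[^"]*"'; then
  star="match"
else
  star="no match"
fi

echo "with +: $plus"
echo "with *: $star"
```

The first test reports "no match" and the second "match", which is why only the second one-liner picks up the empty field.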
# 7  
Old 06-07-2016
Thanks Aia. Also, what if we have two or more sets of data instead of one, as shown below?

Code:
{
  "branchId" : "BNSFGDJNSJG-73264HB-132131BNHJFSDG",
  "branchName" : "NEWYORK-SSDF",
  "branchProductId" : "72Y5HFHSF7H3RUNAWEF",
  "PreferenceId" : "BASDBVcbzcYHcb",
  "emailId" : "hjbasf7hh4aBSDFU@mymail.com",
  "firstName" : "FNM",
  "lastName" : "LNM",
  "middleName" : "",
  "userId" : "LSDFJDSJMDFG238487ybsgfd"
}

{
  "branchId" : "BNSFGDJNSJG-73264HB-132131BNHJFSDG",
  "branchName" : "NEWYORK-SSDF",
  "branchProductId" : "72Y5HFHSF7H3RUNAWEF",
  "userId" : "LSDFJDSJMDFG238487ybsgfd"
}

What changes are required to get the output below?

Code:
branchId,branchName,branchProductId,PreferenceId,emailId,firstName,lastName,middleName,userId
"BNSFGDJNSJG-73264HB-132131BNHJFSDG","NEWYORK-SSDF","72Y5HFHSF7H3RUNAWEF","BASDBVcbzcYHcb", "hjbasf7hh4aBSDFU@mymail.com", "FNM","LNM","","LSDFJDSJMDFG238487ybsgfd"

branchId,branchName,branchProductId,userId
"BNSFGDJNSJG-73264HB-132131BNHJFSDG","NEWYORK-SSDF","72Y5HFHSF7H3RUNAWEF","LSDFJDSJMDFG238487ybsgfd"

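One possible approach, offered as a rough sketch rather than a tested answer from the thread: collect keys and values as before, but print and reset whenever a closing brace is reached. The function name `json_blocks_to_csv` is made up for illustration, and the sketch assumes one "key" : "value" pair per line, no colon inside a key, and no escaped quotes inside a value.

```shell
#!/bin/sh
# Hypothetical helper: print one header line and one value line per {...} block.
json_blocks_to_csv() {
  awk '
    # Key/value line, e.g.:   "branchId" : "BNSF...",
    /"[ \t]*:[ \t]*"/ {
      key = $0
      sub(/^[ \t]*"/, "", key)            # drop indentation and the opening quote
      sub(/".*$/, "", key)                # drop everything from the closing quote on
      val = $0
      sub(/^[^:]*:[ \t]*/, "", val)       # drop the key and the first colon
      sub(/[ \t]*,?[ \t\r]*$/, "", val)   # drop the trailing comma and whitespace
      header = header sep key
      values = values sep val
      sep = ","
      next
    }
    # A closing brace ends the block: print what was collected, then reset.
    /^[ \t]*[}]/ && header != "" {
      if (printed) print ""               # blank line between blocks
      print header
      print values
      header = ""; values = ""; sep = ""; printed = 1
    }
  ' "$@"
}
```

With the data above saved as naveen.file (the name used earlier in the thread), `json_blocks_to_csv naveen.file` should print a header line and a value line for each block, with a blank line between blocks. Each block gets its own header, so blocks with different sets of keys are handled independently.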
 