Converting unstructured data to structured data


 
# 8  
Old 06-07-2016
In the meantime, you might want to try this... I'd like to hear from you if it works in general, even if you prefer the perl thingy :)
Code:
#!/bin/bash

header=(); values=(); start=0

while IFS=":" read -r var val # -r keeps backslashes in the data intact
do

 if [[ "$var" = "{" && "$start" = "0" ]]; then # "new" set detected
  start=1
  continue

 elif [[ "$var" = "" ]]; then
  echo
  continue

 elif [[ "$var" = "}" ]]; then # end of a set: print header row, then value row
  (IFS=","; printf "%s\n%s\n" "${header[*]}" "${values[*]}")
  start=0; header=(); values=()
  continue

 else # process stuff within a set

  var="${var//\"/}"; var="$(echo $var)" # drop the quotes, trim surrounding blanks
  header+=("$var")

  val="${val# }"; val="${val%,}" # drop leading blank and trailing comma
  values+=("$val")

 fi

done <inputfile

# 9  
Old 06-07-2016
Quote:
Originally Posted by naveen.kuppili
Thanks Aia. Also, what if we have two or more sets of data instead of one, as shown below?

Code:
{
  "branchId" : "BNSFGDJNSJG-73264HB-132131BNHJFSDG",
  "branchName" : "NEWYORK-SSDF",
  "branchProductId" : "72Y5HFHSF7H3RUNAWEF",
  "PreferenceId" : "BASDBVcbzcYHcb",
  "emailId" : "hjbasf7hh4aBSDFU@mymail.com",
  "firstName" : "FNM",
  "lastName" : "LNM",
  "middleName" : "",
  "userId" : "LSDFJDSJMDFG238487ybsgfd"
}

{
  "branchId" : "BNSFGDJNSJG-73264HB-132131BNHJFSDG",
  "branchName" : "NEWYORK-SSDF",
  "branchProductId" : "72Y5HFHSF7H3RUNAWEF",
  "userId" : "LSDFJDSJMDFG238487ybsgfd"
}

And what changes are required to get the output below?

Code:
branchId,branchName,branchProductId,PreferenceId,emailId,firstName,lastName,middleName,userId
"BNSFGDJNSJG-73264HB-132131BNHJFSDG","NEWYORK-SSDF","72Y5HFHSF7H3RUNAWEF","BASDBVcbzcYHcb", "hjbasf7hh4aBSDFU@mymail.com", "FNM","LNM","","LSDFJDSJMDFG238487ybsgfd"

branchId,branchName,branchProductId,userId
"BNSFGDJNSJG-73264HB-132131BNHJFSDG","NEWYORK-SSDF","72Y5HFHSF7H3RUNAWEF","LSDFJDSJMDFG238487ybsgfd"

As long as the file looks like what you have posted, this might do it:

Code:
perl -nle 'BEGIN{$/="\n\n"; $"=","} @h=@r=(); while(/(\w+)"\s+:\s+("[^"]*")/g){push @h,$1; push @r,$2} print "@h\n@r\n"' naveen.file

Output:
Code:
branchId,branchName,branchProductId,PreferenceId,emailId,firstName,lastName,middleName,userId
"BNSFGDJNSJG-73264HB-132131BNHJFSDG","NEWYORK-SSDF","72Y5HFHSF7H3RUNAWEF","BASDBVcbzcYHcb","hjbasf7hh4aBSDFU@mymail.com","FNM","LNM","","LSDFJDSJMDFG238487ybsgfd"

branchId,branchName,branchProductId,userId
"BNSFGDJNSJG-73264HB-132131BNHJFSDG","NEWYORK-SSDF","72Y5HFHSF7H3RUNAWEF","LSDFJDSJMDFG238487ybsgfd"
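
For anyone who finds the one-liner dense: it reads the file one blank-line-separated record at a time (`$/="\n\n"`), collects every `"key" : "value"` pair with a global regex match, and prints the keys and the still-quoted values joined by commas (`$"=","`). A rough Ruby sketch of the same steps, in case it helps. The method name `records_to_csv` and the shortened inline sample are made up for illustration, and it assumes the input looks like the posted data.

```ruby
#!/usr/bin/env ruby

# Hypothetical helper (not from the thread): turns text made of
# blank-line-separated { ... } blocks into "header row + value row" strings.
def records_to_csv(text)
  text.split(/\n{2,}/).map do |record|
    # Each match yields [key, quoted_value], like $1 and $2 in the one-liner.
    pairs = record.scan(/(\w+)"\s+:\s+("[^"]*")/)
    next if pairs.empty?                      # skip records without any pair
    keys   = pairs.map { |key, _| key }.join(",")
    values = pairs.map { |_, value| value }.join(",")
    "#{keys}\n#{values}\n"
  end.compact
end

# Shortened inline sample in the same shape as the posted data (assumed format).
sample = <<~DATA
  {
    "branchId" : "B-1",
    "middleName" : "",
    "userId" : "U-1"
  }

  {
    "branchId" : "B-2"
  }
DATA

# One blank line between sets, as in the requested output.
puts records_to_csv(sample).join("\n")
```

Like the one-liner, this only looks at quoted string values, so numbers or nested objects would be skipped silently.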

# 10  
Old 06-07-2016
A little bit of playing with Ruby...

Code:
#!/usr/bin/env ruby 

# call it with: ./thisfile.rb data.txt

word  = '[^"]*'
block = '\{[^}]+\}'

# File.read closes the file again; every block gives a header row, a value row and a blank line
File.read(ARGV[0]).scan(/#{block}/m).each { |current_block|
        puts current_block.scan(/^\s*"(#{word})"/m).flatten.join(","),
             current_block.scan(/^\s*"#{word}"\s*:\s*("#{word}")/m).flatten.join(","),
             ""
        }


Last edited by stomp; 06-07-2016 at 11:39 PM..
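
Each block in the posted sample happens to be a valid JSON object, so another option might be to let a JSON parser do the work instead of regular expressions. The sketch below uses Ruby's bundled `json` library. The method name `json_blocks_to_csv` is made up, and it assumes that the blocks are separated by blank lines and hold only flat string values, as in the sample.

```ruby
#!/usr/bin/env ruby
require 'json'

# Hypothetical helper: parse each blank-line-separated block as a JSON object
# and return one [header_line, value_line] pair per block.
def json_blocks_to_csv(text)
  blocks = text.split(/\n\s*\n/).reject { |block| block.strip.empty? }
  blocks.map do |block|
    record = JSON.parse(block)               # Hash, keys stay in file order
    header = record.keys.join(",")
    # Re-quote the values to match the requested output; embedded quotes
    # are NOT escaped here (assumption: the data contains none).
    values = record.values.map { |v| %("#{v}") }.join(",")
    [header, values]
  end
end

# call it with: ./thisfile.rb data.txt
if __FILE__ == $0 && ARGV[0]
  json_blocks_to_csv(File.read(ARGV[0])).each do |header, values|
    puts header, values, ""
  end
end
```

Unlike the regex versions, this stops with a `JSON::ParserError` on a malformed block instead of quietly printing partial rows, which may or may not be what you want.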
 