Split JSON to different data files


 
# 1  
Old 01-20-2018

Hi Gurus,

I have the JSON file below, and I want to rewrite it into a new file.
I would appreciate it if anyone could provide a solution. I can't use jq.
Code:
{
   "_id": "3ad893cb4cf1560add7b4caffd4b6126",
   "_rev": "1-1f0ce165e1d210319cf6e9f9c6ff654f",
   "name": “couchdb_1.couchdb",
   "type": "couchdb",
   "ts": 1445785730,
   "couchdb": { 
       "auth_cache_misses": { "current": null, "sum": null, "mean":  null, "stddev": null, "min": null, "max": null },
       "database_writes":   { "current": 1955, "sum": 1955, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1 },
       "open_databases":    { "current": 47, "sum": 47, "mean": 0, "stddev": 0.03, "min": 0, "max": 14 },
       "auth_cache_hits":   { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "request_time":      { "current": 934798.325, "sum": 934798.325, "mean": 247.236, "stddev": 9323.841, "min": 0, "max": 415733 },
       "database_reads":    { "current": 688315, "sum": 688315, "mean": 1.316, "stddev": 69.941, "min": 0, "max": 5497 },
       "open_os_files":     { "current": 101, "sum": 101, "mean": 0, "stddev": 0.061, "min": -1, "max": 28 }
   },
   "httpd_request_methods": {
       "PUT":    { "current": 18, "sum": 18, "mean": 0, "stddev": 0.009, "min": 0, "max": 1 },
       "GET":    { "current": 11172, "sum": 11172, "mean": 0.021, "stddev": 0.747, "min": 0, "max": 66 },
       "COPY":   { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "DELETE": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.003, "min": 0, "max": 1 },
       "POST":   { "current": 1948, "sum": 1948, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1 },
       "HEAD":   { "current": 1, "sum": 1, "mean": 0, "stddev": 0.004, "min": 0, "max": 1 } 
   },
   "httpd_status_codes": { 
       "200": { "current": 9073, "sum": 9073, "mean": 0.017, "stddev": 0.589, "min": 0, "max": 53 },
       "201": { "current": 1949, "sum": 1949, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1},
       "202": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "301": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "304": { "current": 81, "sum": 81, "mean": 0, "stddev": 0.026, "min": 0, "max": 3 },
       "400": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.005, "min": 0, "max": 1 },
       "401": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "403": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "404": { "current": 1585, "sum": 1585, "mean": 0.007, "stddev": 0.375, "min": 0, "max": 33 },
       "405": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "409": { "current": 4, "sum": 4, "mean": 0, "stddev": 0.008, "min": 0, "max": 1 },
       "412": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.006, "min": 0, "max": 1 },
       "500": { "current": 1, "sum": 1, "mean": 0, "stddev": 0.004, "min": 0, "max": 1 }
   },
   "httpd": {
       "clients_requesting_changes": { "current": 0, "sum": 0, "mean": 0, "stddev": 0.033, "min": -2, "max": 2 },
       "temporary_view_reads":       { "current": 4, "sum": 4, "mean": 0, "stddev": 0.008, "min": 0, "max": 1 },
       "requests":                   { "current": 12186, "sum": 12186, "mean": 0.023, "stddev": 0.751, "min": 0, "max": 66 },
       "bulk_requests":              { "current": 1920, "sum": 1920, "mean": 0.004, "stddev": 0.06, "min": 0, "max": 1 },
       "view_reads":                 { "current": 206, "sum": 206, "mean": 0.003, "stddev": 0.062, "min": 0, "max": 2 }
   }
}

The output data file should be named couchdb.txt,
with content as below (a null value becomes 0):
Code:
couchdb,couchdb=auth_cache_misses, current=0,sum=0, mean=0,stddev=0, min=0, max=0
couchdb,couchdb=database_writes, current=1955, sum=1955, mean=0.004, stddev=0.061, min=0, max=1
......until couchdb block finished.

Then, in the same file, the next block, httpd_request_methods, should be written as:
Code:
couchdb, httpd_request_methods=PUT, current=18, sum=18, mean=0, stddev=0.009, min=0, max=1
couchdb, httpd_request_methods=GET, current= 11172, sum= 11172, mean=0.021, stddev=0.747, min=0, max=66
.....until httpd_request_methods finished.

Next, the httpd_status_codes block should be written as:
Code:
couchdb, httpd_status_codes=200, current=9073,sum=9073,mean=0.017, stddev=0.589,min=0, max=53
couchdb, httpd_status_codes=201, current=1949, sum=1949, mean=0.004, stddev=0.061, min=0, max=1
.......
couchdb,httpd_status_codes=500,current=1,sum=1,mean=0,stddev=0.004, min=0, max=1
until httpd_status_codes finished

Next, the httpd block should be written as:
Code:
couchdb, httpd=clients_requesting_changes,current=0, sum=0, mean=0, stddev=0.033,min=-2, max=2
......
couchdb, httpd= view_reads,current=206,sum=206,mean=0.003,stddev=0.062,min=0, max=2
until httpd block finished.

# 2  
Old 01-20-2018
Since it is a JavaScript object, try JavaScript in your favorite browser, then save the page to create a file (or write JavaScript that saves to a file, if the browser supports it):
Code:
<html>
<head>
<script>
obj = {
   "_id": "3ad893cb4cf1560add7b4caffd4b6126",
   "_rev": "1-1f0ce165e1d210319cf6e9f9c6ff654f",
   "name": "couchdb_1.couchdb",
   "type": "couchdb",
   "ts": 1445785730,
   "couchdb": {
       "auth_cache_misses": { "current": null, "sum": null, "mean":  null, "stddev": null, "min": null, "max": null },
       "database_writes":   { "current": 1955, "sum": 1955, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1 },
       "open_databases":    { "current": 47, "sum": 47, "mean": 0, "stddev": 0.03, "min": 0, "max": 14 },
       "auth_cache_hits":   { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "request_time":      { "current": 934798.325, "sum": 934798.325, "mean": 247.236, "stddev": 9323.841, "min": 0, "max": 415733 },
       "database_reads":    { "current": 688315, "sum": 688315, "mean": 1.316, "stddev": 69.941, "min": 0, "max": 5497 },
       "open_os_files":     { "current": 101, "sum": 101, "mean": 0, "stddev": 0.061, "min": -1, "max": 28 }
   },
   "httpd_request_methods": {
       "PUT":    { "current": 18, "sum": 18, "mean": 0, "stddev": 0.009, "min": 0, "max": 1 },
       "GET":    { "current": 11172, "sum": 11172, "mean": 0.021, "stddev": 0.747, "min": 0, "max": 66 },
       "COPY":   { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "DELETE": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.003, "min": 0, "max": 1 },
       "POST":   { "current": 1948, "sum": 1948, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1 },
       "HEAD":   { "current": 1, "sum": 1, "mean": 0, "stddev": 0.004, "min": 0, "max": 1 }
   },
   "httpd_status_codes": {
       "200": { "current": 9073, "sum": 9073, "mean": 0.017, "stddev": 0.589, "min": 0, "max": 53 },
       "201": { "current": 1949, "sum": 1949, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1},
       "202": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "301": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "304": { "current": 81, "sum": 81, "mean": 0, "stddev": 0.026, "min": 0, "max": 3 },
       "400": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.005, "min": 0, "max": 1 },
       "401": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "403": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "404": { "current": 1585, "sum": 1585, "mean": 0.007, "stddev": 0.375, "min": 0, "max": 33 },
       "405": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "409": { "current": 4, "sum": 4, "mean": 0, "stddev": 0.008, "min": 0, "max": 1 },
       "412": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.006, "min": 0, "max": 1 },
       "500": { "current": 1, "sum": 1, "mean": 0, "stddev": 0.004, "min": 0, "max": 1 }
   },
   "httpd": {
       "clients_requesting_changes": { "current": 0, "sum": 0, "mean": 0, "stddev": 0.033, "min": -2, "max": 2 },
       "temporary_view_reads":       { "current": 4, "sum": 4, "mean": 0, "stddev": 0.008, "min": 0, "max": 1 },
       "requests":                   { "current": 12186, "sum": 12186, "mean": 0.023, "stddev": 0.751, "min": 0, "max": 66 },
       "bulk_requests":              { "current": 1920, "sum": 1920, "mean": 0.004, "stddev": 0.06, "min": 0, "max": 1 },
       "view_reads":                 { "current": 206, "sum": 206, "mean": 0.003, "stddev": 0.062, "min": 0, "max": 2 }
   }
};

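// Walk sections (i), metrics (j), and statistics (k); write one line per metric.
for (var i in obj) {
   if (obj[i] !== null && typeof obj[i] === "object") {
      for (var j in obj[i]) {
         if (obj[i][j] !== null && typeof obj[i][j] === "object") {
            document.write(obj.type + "," + i + "=" + j + ",");
            var c = 0;
            for (var k in obj[i][j]) {
               // null (like any other falsy value) is written as 0
               document.write(((c++ == 0) ? "" : ",") + k + "=" + ((!obj[i][j][k]) ? 0 : obj[i][j][k]));
            }
            // End the line only after a metric has actually been written
            document.write("<br>");
         }
      }
   }
}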
</script>
</head>
</html>

If a browser cannot be used, run a utility that can interpret JavaScript from the command line or a UI, such as PhantomJS.

Last edited by rdrtx1; 01-21-2018 at 11:22 AM..
# 3  
Old 01-21-2018
rdrtx1's approach is probably the way to go, since it can parse the JSON syntax properly.

Here is an example of what might be done with awk:
Code:
awk '
  # A line ending in "{" opens a nesting level
  /\{[ \t]*$/{
    level++
  }
  # A line holding only "}" or "}," closes a level and ends the current object
  /^[ \t]*\},?[ \t]*$/ {
    level--; object=""
  }
  # The top-level "type" value names the output file
  level==1 && $2=="type"{
    close(objfile)
    objtype=$4
    objfile=objtype ".txt"
  }
  # The first line at level 2 is the object header, e.g. "couchdb": {
  level==2 && object=="" {
    object=$2
    next
  }
  # Every other level-2 line is one metric: $2 is its name, then key/value pairs
  level==2 {
    line=objtype OFS object "=" $2
    for(i=4; i<=NF; i+=2) {
      # Strip the colon, commas, blanks and any closing brace around the value
      split($(i+1),F,"[:, \t}]+")
      line=line OFS $i "=" (F[2]=="null" ? 0 : F[2])
    }
    print line > objfile
  }
' FS=\" OFS=", " file

Output:
Code:
couchdb, couchdb=auth_cache_misses, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, couchdb=database_writes, current=1955, sum=1955, mean=0.004, stddev=0.061, min=0, max=1
couchdb, couchdb=open_databases, current=47, sum=47, mean=0, stddev=0.03, min=0, max=14
couchdb, couchdb=auth_cache_hits, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, couchdb=request_time, current=934798.325, sum=934798.325, mean=247.236, stddev=9323.841, min=0, max=415733
couchdb, couchdb=database_reads, current=688315, sum=688315, mean=1.316, stddev=69.941, min=0, max=5497
couchdb, couchdb=open_os_files, current=101, sum=101, mean=0, stddev=0.061, min=-1, max=28
couchdb, httpd_request_methods=PUT, current=18, sum=18, mean=0, stddev=0.009, min=0, max=1
couchdb, httpd_request_methods=GET, current=11172, sum=11172, mean=0.021, stddev=0.747, min=0, max=66
couchdb, httpd_request_methods=COPY, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_request_methods=DELETE, current=2, sum=2, mean=0, stddev=0.003, min=0, max=1
couchdb, httpd_request_methods=POST, current=1948, sum=1948, mean=0.004, stddev=0.061, min=0, max=1
couchdb, httpd_request_methods=HEAD, current=1, sum=1, mean=0, stddev=0.004, min=0, max=1
couchdb, httpd_status_codes=200, current=9073, sum=9073, mean=0.017, stddev=0.589, min=0, max=53
couchdb, httpd_status_codes=201, current=1949, sum=1949, mean=0.004, stddev=0.061, min=0, max=1
couchdb, httpd_status_codes=202, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=301, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=304, current=81, sum=81, mean=0, stddev=0.026, min=0, max=3
couchdb, httpd_status_codes=400, current=2, sum=2, mean=0, stddev=0.005, min=0, max=1
couchdb, httpd_status_codes=401, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=403, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=404, current=1585, sum=1585, mean=0.007, stddev=0.375, min=0, max=33
couchdb, httpd_status_codes=405, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=409, current=4, sum=4, mean=0, stddev=0.008, min=0, max=1
couchdb, httpd_status_codes=412, current=2, sum=2, mean=0, stddev=0.006, min=0, max=1
couchdb, httpd_status_codes=500, current=1, sum=1, mean=0, stddev=0.004, min=0, max=1
couchdb, httpd=clients_requesting_changes, current=0, sum=0, mean=0, stddev=0.033, min=-2, max=2
couchdb, httpd=temporary_view_reads, current=4, sum=4, mean=0, stddev=0.008, min=0, max=1
couchdb, httpd=requests, current=12186, sum=12186, mean=0.023, stddev=0.751, min=0, max=66
couchdb, httpd=bulk_requests, current=1920, sum=1920, mean=0.004, stddev=0.06, min=0, max=1
couchdb, httpd=view_reads, current=206, sum=206, mean=0.003, stddev=0.062, min=0, max=2

But this is only an approximation that happens to work with this particular sample. If the input file format varies, it will likely fail, whereas rdrtx1's approach will probably still work.
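
To illustrate why a real parser is less fragile than line-based matching, here is a minimal Python sketch (the function name `flatten` and the shortened sample are made up for this illustration). It assumes every nested section holds objects of statistics, as in the sample above, and it checks that a pretty-printed document and the same document squeezed onto one line produce identical output:

```python
import json

def flatten(doc):
    """Return one output line per metric; null values become 0."""
    lines = []
    for section, metrics in doc.items():
        if not isinstance(metrics, dict):
            continue  # skip scalars such as "_id", "type" and "ts"
        for metric, stats in metrics.items():
            if not isinstance(stats, dict):
                continue  # assumption: only objects of statistics are wanted
            fields = ", ".join(
                "%s=%s" % (k, 0 if v is None else v) for k, v in stats.items()
            )
            lines.append("%s, %s=%s, %s" % (doc["type"], section, metric, fields))
    return lines

# Shortened sample taken from the thread, including the line without a
# space before the closing brace.
pretty = """{
   "type": "couchdb",
   "httpd_status_codes": {
       "201": { "current": 1949, "sum": 1949, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1},
       "202": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null }
   }
}"""

# The same document with all optional whitespace removed (a single line).
compact = json.dumps(json.loads(pretty), separators=(",", ":"))

# Layout does not matter to the parser, so both variants give the same lines.
assert flatten(json.loads(pretty)) == flatten(json.loads(compact))
for line in flatten(json.loads(compact)):
    print(line)
```

A line-oriented awk script would see the compact variant as one long line and would need a different set of patterns.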

--
Note: the input file contains
Code:
   "name": “couchdb_1.couchdb",

which has a wrong quote character. It should be:
Code:
   "name": "couchdb_1.couchdb",


Last edited by Scrutinizer; 01-21-2018 at 04:08 AM..
# 4  
Old 01-21-2018
Thanks, geniuses... I couldn't be happier. Now I need to see if I can store this data file in Influx. I'll let you know.
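
For the Influx step, the requested format is close to InfluxDB's line protocol, which, as far as I know, has the shape `measurement,tag=value field=value,field=value timestamp`, with single spaces between the three parts and no spaces after the commas. The following is a rough Python sketch under several assumptions: the tag names `section` and `metric` and the function name `to_line_protocol` are invented here, the `ts` value is taken to be seconds since the epoch, and the timestamp is converted to nanoseconds on the assumption that this is the precision the server expects by default.

```python
def to_line_protocol(doc):
    """Build one line-protocol-style string per metric; null becomes 0."""
    ts_ns = int(doc["ts"]) * 10**9  # assumption: "ts" is in seconds
    lines = []
    for section, metrics in doc.items():
        if not isinstance(metrics, dict):
            continue  # skip scalars such as "type" and "ts"
        for metric, stats in metrics.items():
            fields = ",".join(
                "%s=%s" % (k, 0 if v is None else v) for k, v in stats.items()
            )
            # "section" and "metric" are hypothetical tag names
            lines.append("%s,section=%s,metric=%s %s %d"
                         % (doc["type"], section, metric, fields, ts_ns))
    return lines

# Shortened sample taken from the thread
doc = {
    "type": "couchdb",
    "ts": 1445785730,
    "httpd": {
        "view_reads": {"current": 206, "sum": 206, "mean": 0.003,
                       "stddev": 0.062, "min": 0, "max": 2},
    },
}

for line in to_line_protocol(doc):
    print(line)
```

The keys in this sample contain no spaces, commas or equals signs; keys that do would need escaping, which the sketch does not attempt.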
# 5  
Old 01-24-2018
Some python code for fun :-)

python code:
#! /usr/bin/env python3

from json import load

# data.json must be valid JSON (fix the wrong quote in the "name" line first)
with open('data.json') as f_in:
    data = load(f_in)

with open('output.dat', 'w') as f_out:
    # Process the four nested blocks in the order requested
    for key in ('couchdb', 'httpd_request_methods', 'httpd_status_codes', 'httpd'):
        for k1, v1 in data[key].items():
            string = "%s, %s=%s, " % ("couchdb", key, k1)
            for k2, v2 in v1.items():
                # JSON null is loaded as None; write it as 0, as requested
                string += "%s=%s, " % (k2, 0 if v2 is None else v2)
            string = string.rstrip(', ') + '\n'
            f_out.write(string)

Last edited by balajesuri; 01-24-2018 at 11:37 AM..