Best Method For Query Content In Large JSON Files


 
04-15-2016

What is the best way to query JSON-formatted files for content? Example data:
https://usn.ubuntu.com/usn-db/database-all.json.bz2
When looking at keys as in:
Code:
import json
json_data = json.load(open('database-all.json'))
for keys in json_data.iterkeys():
    print 'Keys--> {} \n'.format(keys)
Keys--> 688-1 

Keys--> 1579-1 

Keys--> 2870-2 

Keys--> 2870-1 

Keys--> 1107-1 
blah blah blah

I can see the data structure:
Code:
json_data['1004-1']
{u'action': u'In general, a standard system update will make all the necessary changes.\n',
 u'cves': [u'CVE-2010-3082'],
 u'description': u'It was discovered that Django did not properly sanitize the cookie value\nwhen applying CSRF protections resulting in a cross-site scripting (XSS)\nvulnerability. With cross-site scripting vulnerabilities, if a user were\ntricked into viewing server output during a crafted server request, a\nremote attacker could exploit this to modify the contents, or steal\nconfidential data, within the same domain.\n',
 u'id': u'1004-1',
 u'isummary': u'Django could be made to insert arbitrary content into web forms.\n',
 u'releases': {u'maverick': {u'archs': {u'all': {u'urls': {u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django-doc_1.2.3-1ubuntu0.1_all.deb': {u'md5': u'5f3ed62933c8f4970101ead2d57d7d4f',
       u'size': 1905856},
      u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3-1ubuntu0.1_all.deb': {u'md5': u'8c85dcb4ab4d9701cd546e2e119ae4e3',
       u'size': 4212250}}},
    u'source': {u'urls': {u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3-1ubuntu0.1.debian.tar.gz': {u'md5': u'2e8c4c95d6d40cce184131f1001a01a2',
       u'size': 18499},
      u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3-1ubuntu0.1.dsc': {u'md5': u'a5cb861587d952430ae73da49a9680cf',
       u'size': 2249},
      u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3.orig.tar.gz': {u'md5': u'10bfb5831bcb4d3b1e6298d0e41d6603',
       u'size': 6306760}}}},
   u'binaries': {u'python-django': {u'version': u'1.2.3-1ubuntu0.1'}},
   u'sources': {u'python-django': {u'description': u'High-level Python web development framework',
     u'version': u'1.2.3-1ubuntu0.1'}}}},
 u'summary': u'python-django vulnerability',
 u'timestamp': 1287004073.841373,
 u'title': u'Django vulnerability'}

and can access each level easily as in:
Code:
json_data['1004-1']['action']
u'In general, a standard system update will make all the necessary changes.\n'
json_data['1004-1']['releases']['maverick']
{u'archs': {u'all': {u'urls': {u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django-doc_1.2.3-1ubuntu0.1_all.deb': {u'md5': u'5f3ed62933c8f4970101ead2d57d7d4f',
     u'size': 1905856},
    u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3-1ubuntu0.1_all.deb': {u'md5': u'8c85dcb4ab4d9701cd546e2e119ae4e3',
     u'size': 4212250}}},
  u'source': {u'urls': {u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3-1ubuntu0.1.debian.tar.gz': {u'md5': u'2e8c4c95d6d40cce184131f1001a01a2',
     u'size': 18499},
    u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3-1ubuntu0.1.dsc': {u'md5': u'a5cb861587d952430ae73da49a9680cf',
     u'size': 2249},
    u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3.orig.tar.gz': {u'md5': u'10bfb5831bcb4d3b1e6298d0e41d6603',
     u'size': 6306760}}}},
 u'binaries': {u'python-django': {u'version': u'1.2.3-1ubuntu0.1'}},
 u'sources': {u'python-django': {u'description': u'High-level Python web development framework',
   u'version': u'1.2.3-1ubuntu0.1'}}}

but when I attempt to iterate through the keys and values, I can't do so easily. My goal is simply to loop through the entire JSON file and print the values I would like to see, as in:
Code:
json_data['1004-1']['cves']
[u'CVE-2010-3082']
json_data['1004-1']['releases']['maverick']['archs']['all']['urls']
{u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django-doc_1.2.3-1ubuntu0.1_all.deb': {u'md5': u'5f3ed62933c8f4970101ead2d57d7d4f',
  u'size': 1905856},
 u'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3-1ubuntu0.1_all.deb': {u'md5': u'8c85dcb4ab4d9701cd546e2e119ae4e3',
  u'size': 4212250}}
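
One way to get at a key like 'urls' in every entry, whichever release it sits under, is a small recursive walk over the nested dicts and lists. The following is only a sketch under assumptions: find_key is a hypothetical helper (not part of the json module), and sample is a cut-down stand-in for json_data in the shape shown above. It avoids Python-2-only syntax, so it should run on both Python 2 and Python 3.

```python
def find_key(node, wanted):
    """Yield every value stored under the key `wanted`, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == wanted:
                yield value
            # Keep descending: the same key may also appear deeper down.
            for found in find_key(value, wanted):
                yield found
    elif isinstance(node, list):
        for item in node:
            for found in find_key(item, wanted):
                yield found


# Cut-down stand-in for json_data (assumed shape, copied from the dump above).
sample = {
    '1004-1': {
        'cves': ['CVE-2010-3082'],
        'releases': {
            'maverick': {
                'archs': {
                    'all': {
                        'urls': {
                            'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3-1ubuntu0.1_all.deb': {
                                'md5': '8c85dcb4ab4d9701cd546e2e119ae4e3',
                                'size': 4212250,
                            },
                        },
                    },
                },
            },
        },
    },
}

for usn_id, entry in sample.items():
    print('USN {}: cves={}'.format(usn_id, entry.get('cves', [])))
    # Each hit is one 'urls' dict; its keys are the download URLs.
    for urls in find_key(entry, 'urls'):
        for url in sorted(urls):
            print('  {}'.format(url))
```

Because the walk does not hard-code 'maverick' or 'all', entries that only list other releases or architectures are still covered, at the cost of visiting every node in each entry.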

I have made many attempts to iterate over the data loaded with the "json" module:
Code:
for keys, values in json_data.iteritems():
    print 'Keys--> {} \n'.format(keys)
    for val in values.iteritems():
        print u'Values--> {} \n'.format(val)
Keys--> 2251-1 

Values--> (u'description', u'A bounds check error was discovered in the socket filter subsystem of the\nLinux kernel. A local user could exploit this flaw to cause a denial of\nservice (system crash) via crafted BPF instructions. (CVE-2014-3144)\n\nA remainder calculation error was discovered in the socket filter subsystem\nof the Linux kernel. A local user could exploit this flaw to cause a denial\nof service (system crash) via crafted BPF instructions. (CVE-2014-3145)\n')
blah blah blah

but it doesn't work when I attempt any kind of searching, as in:
Code:
In [65]: for keys, values in json_data.iteritems():
    print 'Keys--> {} \n'.format(keys)
    for val in values.iteritems():
        print u'Values--> {} \n'.format(val['releases']['maverick']['archs']['all']['urls'])
   ....:         
Keys--> 756-1 

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-65-ea4c264328b0> in <module>()
      2     print 'Keys--> {} \n'.format(keys)
      3     for val in values.iteritems():
----> 4         print u'Values--> {} \n'.format(val['releases']['maverick']['archs']['all']['urls'])
      5 

TypeError: tuple indices must be integers, not str
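
The traceback is consistent with how iteritems() behaves: it yields (key, value) tuples, so val in the inner loop is a tuple such as (u'description', u'...') and cannot be indexed with 'releases'. The outer loop's values is already the per-entry dict, so it can be indexed directly. Below is a minimal sketch of that idea, assuming that not every entry has a 'maverick' release (hence the chained .get() calls). urls_for is a hypothetical helper, sample stands in for json_data, the '0000-0' entry is made up for illustration, and .items() is used so the snippet runs on Python 2 and 3.

```python
def urls_for(entry, release, arch='all'):
    """Return the urls dict for one release/arch, or {} if a level is missing."""
    node = entry.get('releases', {})
    for key in (release, 'archs', arch, 'urls'):
        node = node.get(key, {})
    return node


# Stand-in for json_data; '0000-0' is a made-up entry with no maverick release.
sample = {
    '1004-1': {
        'releases': {
            'maverick': {
                'archs': {
                    'all': {
                        'urls': {
                            'http://security.ubuntu.com/ubuntu/pool/main/p/python-django/python-django_1.2.3-1ubuntu0.1_all.deb': {
                                'md5': '8c85dcb4ab4d9701cd546e2e119ae4e3',
                                'size': 4212250,
                            },
                        },
                    },
                },
            },
        },
    },
    '0000-0': {'releases': {}},
}

for usn_id, entry in sorted(sample.items()):
    # entry is already the per-USN dict, so look things up on it directly
    # instead of looping over entry.items(), which yields tuples.
    urls = urls_for(entry, 'maverick')
    print('Keys--> {}'.format(usn_id))
    for url, info in sorted(urls.items()):
        print('  {} (size {})'.format(url, info['size']))
```

Entries without the requested release simply produce an empty dict here, so the loop prints the key and moves on rather than raising KeyError.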

God help me !!!!!!!!!!!!

Last edited by metallica1973; 04-15-2016 at 06:47 PM.