Archive

Archive for February 17, 2013

Load JSON to SQL Server – Challenges and Alternatives(MongoDB)

February 17, 2013 2 comments

Have you ever tried loading JSON type data into SQL Server? If yes then you would agree that it is not easy to get it done.
JSON (JavaScript Object Notation) is a lightweight data-interchange format. JSON is widely used everywhere now and mostly used by web developer to create stateful web applications.

Simplified example :

{
        “name” : “James”,
        “Age” : 50,
        “Favourite_Color” : “Purple”
}

In other terms JSON document is collection of key\value which can be nested to several lavels . JSON document(Every record is called document in JSON) can have several subdocuments with multiple level of arrays and lists.

So now for one new project we had a requirement to load vendor data into sql server.This vendor data was in JSON format with really complex structure and several level of nesting. We were getting around 20 JSON files daily from vendor that I was needed to load in DB.

There is no easy way to load JSON data in Sql Server. No in-built method\function or SSIS component for this task. I bingoogle to see how others are doing it since this format is widely used.
Didn’t find anything except one simple-talk blog post from Phil Factor. Post

Really good article. Author created function using TSQL for complex string manipulation to parse JSON type data. I tried it first with some sample data which worked fine but data sent to us was really huge (Some of the files were around 1 GB) with crazy complex data nesting.
Process kept failing with one or other error .So first I tried to modify and enhance this function so that it works for my dataset but file structure was quite complex to make it work.

I have no idea why Sql Server doesn’t provide easy way to do it .. I see there is connect case opened for this feature with overwhelming support but nothing from MS yet 😦 I also commented on it a while back. Please upvote if you are struggling to load JSON data into Sql Server.
Here is connect item link

Next I tried c# to convert JSON data to xml and then use ssis to load it in DB. .net have JSON serializer and deserializer classes available for this but I used NewtonSoft JSON extension which really made it easy. Once file is converted to xml then ssis loaded it to DB.

Well this worked but (there is always a but 🙂 ) setting up xml load wasn’t easy. First conversion to xml was slow and then one of the converted xml file was loading around 19 tables . So you can guess the nesting here. We were getting different files every now and then which made this process difficult to maintain but we had no option but to live with it with quite a bit of manual intervention every now and then 😦

Now lets talk about other alternates(NoSQL).
There are several NoSql platform that allow loading these JSON files easy. MongoDB, CouchDB etc.
After all the dust settled down i started looking into some of these options. MongoDB for one is really easy to set up and work with. I won’t go into installations and set up as it is easily available on MongoDB website. Once MongoDB is installed then loading JSON data is piece of cake.
Now to load JSON file in customer DB i would run one single line command.

At command prompt in bin directory:

>mongoimport –database orders –collection ordDetail < c:\orders.json

All done. File loaded. If db (orders in my case) doesn’t exist already then above mongoimport command will create database. Collection represents table in MongoDB and if
collection(ordDetail in my case) doesn’t exist already then it will be created too. MongoDB is flexible schema database system which works great for json, csv and tsv format files.

Once file is loaded then here are some basic MongoDB queries to check data.

Count how many documents loaded or total number of documents in ordDetail collection. This is equivalent to select count(*) from table in relational db’s.

>db.ordDetail.count()

To fetch one row\document from collection.

>db.ordDetail.findOne()

Sample output after masking data.

{
        “_id” : ObjectId(“511dc7e3476eee41aa7a073a”),
        “name” : “Burroug’s”,
        “OrderNumber” : “9187157999-3909592”,
        “Amount” : 48.4,
        “impression” : 0,
        “customField1” : “jhhgh10”,

}

To query everything: (Select * from table )

>db.ordDetail.find()

I would blog about MongoDB query commands in future posts.

Application owner wanted to move application to MongoDB now but project was put on hold as there are several other challenges comes into picture when working with NoSQL dbs.

Argument continues 🙂

To conclude here is my journey to try to load JSON in Sql Server. Quite a few other RDBMS already have functionality to load JSON data . I hope MS would add this feature in Sql Server soon as .Net framework already have several JSON related classes.

Thanks,
Neeraj

Categories: Uncategorized Tags: , ,