Load JSON to SQL Server – Challenges and Alternatives (MongoDB)

February 17, 2013 2 comments

Have you ever tried loading JSON data into SQL Server? If so, you would agree that it is not easy to get done.
JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is widely used everywhere now, mostly by web developers building web applications.

Simplified example:

{
        "name" : "James",
        "Age" : 50,
        "Favourite_Color" : "Purple"
}

In other words, a JSON document is a collection of key/value pairs which can be nested to several levels. A JSON document (every record is called a document in JSON) can have several subdocuments with multiple levels of arrays and lists.
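To illustrate the nesting (this is a made-up document, not from our vendor feed), here is a quick sketch using Python's standard json module, just to show how keys can nest through subdocuments and arrays:

```python
import json

# Hypothetical nested document: a customer holding a list of orders,
# each order holding its own array of line-item subdocuments.
doc = '''
{
    "name": "James",
    "Age": 50,
    "Orders": [
        {"OrderNumber": "A-100", "Amount": 48.4,
         "Items": [{"sku": "X1", "qty": 2}]},
        {"OrderNumber": "A-101", "Amount": 12.0,
         "Items": [{"sku": "X9", "qty": 1}]}
    ]
}
'''

parsed = json.loads(doc)
# Keys nest several levels deep: document -> Orders -> Items
print(parsed["Orders"][0]["Items"][0]["sku"])   # X1
print(len(parsed["Orders"]))                    # 2
```

Flattening a shape like this into relational tables is exactly what makes the SQL Server load painful.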

Now, for one new project, we had a requirement to load vendor data into SQL Server. This vendor data was in JSON format with a really complex structure and several levels of nesting. We were getting around 20 JSON files daily from the vendor that I needed to load into the DB.

There is no easy way to load JSON data into SQL Server: no built-in method, function, or SSIS component for this task. I Bing'd/Googled to see how others are doing it, since this format is so widely used, and didn't find anything except one Simple-Talk blog post from Phil Factor. Post

It is a really good article. The author created a T-SQL function that uses complex string manipulation to parse JSON data. I tried it first with some sample data, which worked fine, but the data sent to us was really huge (some of the files were around 1 GB) with crazy complex nesting.
The process kept failing with one error or another. So first I tried to modify and enhance this function so that it worked for my dataset, but the file structure was too complex to make it work.

I have no idea why SQL Server doesn't provide an easy way to do this. I see there is a Connect case open for this feature with overwhelming support, but nothing from MS yet :-( I also commented on it a while back. Please upvote if you are struggling to load JSON data into SQL Server.
Here is the Connect item link

Next I tried C# to convert the JSON data to XML and then used SSIS to load it into the DB. .NET has JSON serializer and deserializer classes for this, but I used the Newtonsoft JSON library, which really made it easy. Once a file was converted to XML, SSIS loaded it into the DB.
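My actual conversion was done in C# (if I recall correctly, Json.NET exposes JsonConvert.DeserializeXmlNode for exactly this). As a rough, language-neutral sketch of the same idea, here is a minimal recursive JSON-to-XML converter in Python. This is a simplification: a real converter also has to deal with attributes, invalid element names, arrays of scalars, and so on.

```python
import json
import xml.etree.ElementTree as ET

def json_to_xml(key, value):
    """Tiny sketch: dicts become child elements,
    list members become repeated <item> elements,
    scalars become element text."""
    elem = ET.Element(key)
    if isinstance(value, dict):
        for k, v in value.items():
            elem.append(json_to_xml(k, v))
    elif isinstance(value, list):
        for item in value:
            elem.append(json_to_xml('item', item))
    else:
        elem.text = str(value)
    return elem

doc = json.loads('{"name": "James", "Orders": [{"Amount": 48.4}]}')
root = json_to_xml('order', doc)
xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)
```

Once the data is XML, the standard SSIS XML source can shred it into tables.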

Well, this worked, but (there is always a but :-) ) setting up the XML load wasn't easy. First, conversion to XML was slow, and then one of the converted XML files was loading around 19 tables, so you can guess the nesting here. We were getting different files every now and then, which made this process difficult to maintain, but we had no option except to live with it, with quite a bit of manual intervention every now and then :-(

Now let's talk about other alternatives (NoSQL).
There are several NoSQL platforms that make loading these JSON files easy: MongoDB, CouchDB, etc.
After all the dust settled, I started looking into some of these options. MongoDB, for one, is really easy to set up and work with. I won't go into installation and setup, as it is easily available on the MongoDB website. Once MongoDB is installed, loading JSON data is a piece of cake.
Now, to load a JSON file into the customer DB, I would run one single-line command.

At a command prompt in the bin directory:

> mongoimport --db orders --collection ordDetail < c:\orders.json

All done; file loaded. If the db (orders in my case) doesn't exist already, then the above mongoimport command will create the database. A collection represents a table in MongoDB, and if the
collection (ordDetail in my case) doesn't exist already, it will be created too. MongoDB is a flexible-schema database system, which works great for JSON, CSV, and TSV files.

Once the file is loaded, here are some basic MongoDB queries to check the data.

Count how many documents were loaded, i.e. the total number of documents in the ordDetail collection. This is the equivalent of select count(*) from table in relational DBs:

> db.ordDetail.count()
To fetch one row/document from the collection:

> db.ordDetail.findOne()
Sample output after masking data:

{
        "_id" : ObjectId("511dc7e3476eee41aa7a073a"),
        "name" : "Burroug's",
        "OrderNumber" : "9187157999-3909592",
        "Amount" : 48.4,
        "impression" : 0,
        "customField1" : "jhhgh10",

To query everything (Select * from table):

> db.ordDetail.find()

I will blog about MongoDB query commands in future posts.

The application owner wanted to move the application to MongoDB now, but the project was put on hold, as several other challenges come into the picture when working with NoSQL DBs.

Argument continues :-)

To conclude, this was my journey trying to load JSON into SQL Server. Quite a few other RDBMSs already have functionality to load JSON data. I hope MS adds this feature to SQL Server soon, as the .NET Framework already has several JSON-related classes.


Categories: Uncategorized

Load Excel Spreadsheet to Table – Tool That I Built

October 15, 2012 Leave a comment

Very often we are asked to load data from Excel spreadsheets into a DB for data analysis, a data fix, etc. There are several out-of-the-box ways to load data into a DB, like SSIS, bulk insert, and now PowerShell. All these methods work fine and do the job, but they carry quite a bit of learning overhead for non-DB folks. The other problem is permissions: in almost every organization, developers do not have write access beyond the Dev environment. So it falls to the DBAs to do the job, which soon becomes a PITA as requests keep pouring in from developers :-). To avoid these situations and help developers load data from Excel into a table in the DB, I built one small tool in C#.

We allow developers to load data from Excel into tables in one specific DB, which can later be used to analyze data by joining with the primary DB. We built a process to clean up these temp tables after 15 days.

Here is a screenshot:


All the fields in the above form are self-explanatory. I am using two different providers here, "Microsoft.Jet.OLEDB.4.0" and "Microsoft.ACE.OLEDB.12.0". If you don't have the latest version of MS Office (2010), then it might be the case that you don't have the latest driver. You need to download and install AccessDatabaseEngine_x64.exe from the location below.


If you don't want to download and install the latest driver, then you need to select the "Live in Cave" (no offense :-)) option. This makes the tool use the Jet OLE DB provider to load the Excel file. But there is one caveat with this option: you can't load Excel files with the .xlsx extension. With the .xlsx extension, Excel files are stored as XML, which the Jet OLE DB provider won't recognize. The workaround is to open the .xlsx file and save it as .xls; you should then be able to use the tool to load the .xls file.
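For reference, the two providers differ mainly in the connection string. These are the standard OLE DB connection strings; the file paths are illustrative, and the HDR property corresponds to the "first row has column names" choice:

```
Jet provider, for .xls ("Live in Cave" option):
  Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\data\file.xls;
  Extended Properties="Excel 8.0;HDR=YES"

ACE provider, for .xlsx:
  Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\data\file.xlsx;
  Extended Properties="Excel 12.0 Xml;HDR=YES"
```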

If you don't select the "Existing Table" option, then the tool will create a new table in the DB with all columns as Varchar(20). If the first row in the Excel sheet doesn't have column names, then arbitrary column names (F1, F2, etc.) are used for the new table.

Since we have DBs in multiple domains with no trust between them, we allow only SQL authentication to load data. I disabled Windows Authentication in the tool for now; I will enable it in the next version for general use.

This tool has really helped DBAs and developers quickly get data into the DB for a data fix or analysis, so give it a shot and see if it helps you the way it did us.

Disclaimer: This tool has been tested and used on several machines successfully, but there is still no guarantee that it will work fine on all machines, since a lot of environment-related factors play a role during execution. Though I have not seen any security-related issues while using it, use on a production DB is not recommended.

The app can be downloaded from here

Happy Loading


Categories: Sql Server, Tool

Find and Replace string using Powershell

August 23, 2012 Leave a comment

This will be a quick post on finding and replacing text in SSIS config files using PowerShell. Recently we upgraded and moved our DBs, which required modifying
config files to point to the new server names. Since we are using XML config files, it is tedious to go and check every config file and modify the server name.

If you have one or two config files, it is not a problem, but if you have more than 50, it doesn't make any sense to go into each file and modify it manually.

Again, PowerShell to the rescue. PowerShell makes it super easy to do this in one line of code.

Here is powershell script :

get-childitem -recurse -include "*.dtsconfig" | select-string "OLDSERVERNAME" -list | % { (get-content $_.path) | % { $_ -replace "OLDSERVERNAME", "NEWSERVERNAME" } | set-content $_.path; write-host "Processed :- $($_.path)" }

-recurse is used to recurse through all the folders in directory.
-include is used to look into file with “dtsconfig” extension only.

If you just want to get the names of the files that will be modified, you can run the command below.

get-childitem -recurse -include "*.dtsconfig" | select-string "OLDSERVERNAME" | select Path

This searches for the string and prints the complete path and name of each file.

You can also group by filename and find out the number of times the string matches in each file.

get-childitem -recurse -include "*.dtsconfig" | select-string "OLDSERVERNAME" | group-object path | select count, Name | format-table -auto

So here it is. PowerShell again makes life easy.


Categories: PowerShell, SSIS

Table MetaData for Query Tuning – Quick and Dirty

August 7, 2012 Leave a comment

An important part of life as a DBA is responding quickly during any disaster or performance issue. As much as it sucks, it is really challenging and interesting at times. You always have to be ready with all the tools and scripts in your arsenal to identify and troubleshoot the problem. So this entry is again related to tuning, as have been some of my previous posts.

Several times I get scripts from developers with subjects like "running slow. please check". While looking into it, I try to get as much information as possible about basic stats related to the tables in the query. This information includes index fragmentation, index stats info, indexes, and indexed columns. If the query is small and includes one or two tables, this task is easy: just select the table and hit a shortcut (I heavily use shortcuts) that presents all the needed data. But if the query is complex and involves several tables, it becomes tiresome to get this information for each and every table in the same output.

I created the process below to take care of that. I created an SP that takes SQL text as a parameter and executes it using sp_executesql (tip: use set fmtonly on to discard any output). This caches the plan. Finally, I query the cached plan for the recently executed SQL text to get the table names. Once I have the table names, I query the metadata views to grab all the needed information.

Here is how it works.

Let's create sample tables and load some dummy data.

Create table Tableone(Id int identity(1,1) primary key ,item int ,dt datetime)
Create Table Tabletwo(Id int primary key ,price decimal(10,3) , dt1 datetime)

Below I am using Merge to load random data in both tables simultaneously.

Declare @i int
set nocount on
Set @i = 1

While @i <= 10000
Begin
Merge TableOne as target
Using (values (1)) as src (val)
on 1=2
when not matched
then insert (item , dt) values (abs(checksum(rand())), dateadd(second, abs(checksum(rand()))/10, '20060101'))
output inserted.id, cast(abs(checksum(rand()))/100000000.0 as decimal(10,3)), inserted.dt
into Tabletwo;
set @i += 1
End
Create a couple of indexes on both tables to test our metadata query output.

create nonclustered index ncx_tableone on tableone(item ,dt)
create nonclustered index ncx_tabletwo_incl on tabletwo(dt1 desc ,price) include(id)

Finally, create the SP below to execute the SQL text and query the cache to fetch table-related metadata.

CREATE Proc nsp_getTableMetaData @sql nvarchar(4000)
AS
Set nocount on

Declare @sql1 nvarchar(4000) = @sql

--Execute sql text to generate query plan

Exec sp_executesql @sql1

--Grab the cached plan of the text we just executed.
select qp.query_plan
into #t from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(plan_handle) qt
cross apply sys.dm_exec_query_plan(plan_handle) qp
where qt.text like '%' + @sql1 + '%'

–Store table names in temp table for meta data query.
;with xmlnamespaces (default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')

select distinct tbl.value(N'@Table', N'sysname') as [TableName] into #t2 from #t
cross apply #t.query_plan.nodes('//RelOp[./*/Object/@Database]') as qp(obj)
cross apply qp.obj.nodes('./*/Object') as obj(tbl)

Select x.[TableName] as TableName, i.name as index_name, STATS_DATE(i.object_id, i.index_id) AS statistics_update_date
, f.avg_fragmentation_in_percent, stuff(IndexCols.Colname, 1, 1, '') as index_columns, stuff(IndexIncCols.Colname, 1, 1, '') as included_columns, f.page_count
from sys.indexes i
Inner Join (select [TableName] from #t2) as x on i.object_id = Object_id(x.TableName)
Cross apply dbo.my_index_physical_stats(DB_ID(), Object_id(x.TableName), i.index_id, null, NULL) as f
CROSS APPLY (SELECT ', ' + cl.name +
CASE inc.is_descending_key WHEN 1 THEN ' DESC' ELSE '' END
FROM sys.index_columns inc
JOIN sys.columns cl ON inc.object_id = cl.object_id
AND inc.column_id = cl.column_id
WHERE inc.object_id = i.object_id
AND inc.index_id = i.index_id
AND inc.is_included_column = 0
ORDER BY inc.key_ordinal
FOR XML PATH('')) AS IndexCols(Colname)
OUTER APPLY (SELECT ', ' + cl.name
FROM sys.index_columns inc
JOIN sys.columns cl ON inc.object_id = cl.object_id
AND inc.column_id = cl.column_id
WHERE inc.object_id = i.object_id
AND inc.index_id = i.index_id
AND inc.is_included_column = 1
ORDER BY inc.index_column_id
FOR XML PATH('')) AS IndexIncCols(Colname)
where i.type in (1,2)


I noticed that I can't use sys.dm_db_index_physical_stats directly in the query with cross apply; it was throwing a correlated-subquery error. I Bing'd
and found a workaround in an old post from Paul Randal (sqlskills.com), where Paul blogged about working around this by creating a table-valued function wrapping
sys.dm_db_index_physical_stats. That solution was for the 2005 version; for 2008 you just need to add one extra column to the TVF. You can check the above link to read more.

Here is copy of function :

CREATE FUNCTION my_index_physical_stats (
@database_id INT,
@object_id INT,
@index_id INT,
@partition_number INT,
@mode INT)
RETURNS @result TABLE (
database_id SMALLINT NULL, object_id INT NULL, index_id INT NULL, partition_number INT NULL,
index_type_desc NVARCHAR(60) NULL, alloc_unit_type_desc NVARCHAR (60) NULL, index_depth TINYINT NULL,
index_level TINYINT NULL, avg_fragmentation_in_percent FLOAT NULL, fragment_count BIGINT NULL,
avg_fragment_size_in_pages FLOAT NULL, page_count BIGINT NULL, avg_page_space_used_in_percent FLOAT NULL,
record_count BIGINT NULL, ghost_record_count BIGINT NULL, version_ghost_record_count BIGINT NULL,
min_record_size_in_bytes INT NULL, max_record_size_in_bytes INT NULL, avg_record_size_in_bytes FLOAT NULL,
forwarded_record_count BIGINT NULL, compressed_page_count BIGINT NULL)
AS
BEGIN
INSERT INTO @result SELECT * FROM
sys.dm_db_index_physical_stats (@database_id, @object_id, @index_id, @partition_number, @mode)
RETURN
END

Finally, you can test it by running the queries below. Add the set fmtonly option if you don't want to see the query output.

Exec nsp_getTableMetaData ' set fmtonly on;
declare @id int = 4 select t1.id , t1.item , t2.price , t2.dt1
from tableone t1 inner join tabletwo t2 on t1.id = t2.id
where t1.id = @id ; set fmtonly off;'

You can even run this with parameter values.

Exec nsp_getTableMetaData 'declare @id int = 4 select t1.id , t1.item , t2.price , t2.dt1
from tableone t1 inner join tabletwo t2 on t1.id = t2.id
where t1.id = @id;'


Exec nsp_getTableMetaData 'select t1.id , t1.item , t2.price , t2.dt1
from tableone t1 inner join tabletwo t2 on t1.id = t2.id
where t1.id = 6 ;'

This has helped me a lot with my day-to-day query performance troubleshooting. It gets me basic troubleshooting information fast, so I can fix the issue or dig deeper into the query if nothing is wrong with the indexes or stats.

I plan to enhance it to include other statistics related to the query, like logical reads.

Check it out and let me know what you think.


Categories: Uncategorized

Script Table Index in SSMS Using Powershell

July 17, 2012 Leave a comment

The most important part of a DBA's job is to regularly tune DB queries. When you are tuning queries, you want to know the indexes created on a table along with their column names. This tip is all about getting that information from the DB quickly.

There are several ways to get this information. You can expand the table in SSMS and script indexes manually, which becomes tedious when you have to locate the table among thousands of tables :-), as in my case. Or you can create your own SP querying the system tables and set it up as a shortcut key, which works great. But as I have been playing with PowerShell lately, I tried to do the same using PowerShell. There is one small caveat: if you want to run a PowerShell script from SQL Server, you have to use xp_cmdshell. xp_cmdshell is disabled by default, so you may need to enable it first by running the command below:

Exec sp_configure 'xp_cmdshell', 1;
RECONFIGURE WITH OVERRIDE;


I created the PowerShell script below and saved it on the server as script_indexes.ps1.

param([string]$Server, [string]$database, [string]$table)

[reflection.assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo") | out-null
$srv = new-object Microsoft.SqlServer.Management.Smo.Server($Server)
$db = $srv.databases[$database]
$tb = $db.tables | where-object {$_.Name -like $table}

foreach ($index in $tb.Indexes)
{
    $index.script() + "GO `r`n "
}

Finally, create an SP in the DB that calls this PowerShell script and outputs the index script in the SSMS results window. For me this has been a really fast way to see the columns in an index, including included columns. It also makes it easy to modify indexes, as you already have the create-index script ready. Here is the SP code.

Sample Call: Exec nsp_script_indexes 'TableName'


CREATE PROCEDURE nsp_Script_Indexes
@TbName sysname
AS
BEGIN
SET NOCOUNT ON;

DECLARE @Servername AS sysname, @DbName AS sysname, @cmd AS VARCHAR (2000);

--Temp table to store powershell command output.
CREATE TABLE #output (
id           INT           IDENTITY,
Index_Script VARCHAR (MAX));

SET @Servername = @@SERVERNAME;
SET @DbName = DB_NAME();

--Change path for powershell file location for your server.
SET @cmd = 'PowerShell.exe "G:\PowerShell\script_indexes.ps1 ' + '"' + @Servername + '"' + ' "' + @DbName + '"' + ' "' + @TbName + '"' + '"';
--Print @cmd

INSERT INTO #output (Index_Script)
EXECUTE xp_cmdshell @cmd;

--Delete unnecessary data from temp table.
DELETE #output
WHERE  Index_Script IS NULL
OR Index_Script = '';

--Use 'GO' word in script to find end of each create index.
SELECT row_number() OVER (ORDER BY id) AS rn, id
INTO   #opt
FROM   #output
WHERE  ltrim(rtrim(Index_Script)) = 'GO';

DECLARE @i AS INT, @c AS INT, @ind AS NVARCHAR (1000), @d AS INT, @n AS INT;

SET @d = 0;

--Finally concatenate each index script back and print it.
SELECT @i = MIN(rn),
@c = max(rn)
FROM   #opt;

WHILE (@i <= @c)
BEGIN
SELECT @n = id
FROM   #opt
WHERE  rn = @i;

SELECT @ind = (SELECT CONVERT (VARCHAR (1000), Index_Script + '')
FROM   #output
WHERE  id < @n
AND id > @d
FOR    XML PATH (''));

SET @d = @n;
SET @i = @i + 1;

PRINT @ind + char(10) + 'GO';
END
END

Now you can add this SP call as a shortcut key in SSMS, and you are ready to quickly get index info while troubleshooting queries.

Sample Call: Exec nsp_script_indexes 'TableName'

Happy tuning..


Categories: Sql Server

Avoid Trigger Execution – Special Cases

June 12, 2012 Leave a comment

A quick blog post on how to avoid trigger execution during some ad hoc activity.

Today I needed to update around 12K rows in a table which had an after-update trigger. As a special case, we didn't want the trigger to be executed after the update. I knew I could do it by creating a temp table in the update statement's session and then altering the trigger to return if that temp table exists. We tried that and it worked great.

But I still Bing'd to see if there is any other way to accomplish this. I saw the same question in one forum, with the Context_Info() function mentioned as the answer. I didn't see any example code in the post, so just for testing I ran a couple of scripts to verify it. It works great too, and it is also a much cleaner way to get it done.

Scripts :



--Table definition inferred for this demo; any small test table works.
CREATE TABLE trtable (ID INT, gn VARCHAR(10));

INSERT INTO trtable (ID, gn)
SELECT 1, 'ab'
UNION ALL
SELECT 2, 'bc'
UNION ALL
SELECT 3, 'cd';

--SELECT * FROM trtable

Trigger Script:

--Trigger name and event list are illustrative.
CREATE TRIGGER tr_trtable ON trtable
AFTER INSERT, UPDATE
AS
declare @c varbinary(128)
set @c = CONTEXT_INFO()

--Skip the trigger body when the session has flagged itself with 0x01.
If @c = 0x01
return;

EXEC msdb.dbo.sp_send_dbmail
@profile_name = 'local',
@recipients = 'njandwani@local.com',
@body = 'Trigger executed.',
@subject = 'blah-blah-and-blah';

Now, if you set the session's context info first, the DML below on trtable won't execute the trigger code (in my trigger, the email part).

SET CONTEXT_INFO 0x01;

INSERT INTO trtable (ID, gn)
SELECT 4, 'ef'

So here it is: another way to skip trigger execution in special cases. I didn't get time to do further testing on the behaviour in explicit transactions
or in other types of triggers. I will try to check it out if I get time.


Categories: Uncategorized

DDL Trigger to Automate Permission

June 8, 2012 Leave a comment

This bit is about object permissions in the DB and how we automated them.

In our DB environment we use role-based security for every DB. Whenever any SP or function is deployed in a DB, we grant the appropriate permission
to a role in that DB. Direct access to tables/views is not granted to application logins. We needed to come up with a solution to grant the appropriate
permission whenever a new SP/function is created in a DB. So DDL triggers come to the rescue.

Here is a small script I deployed in my DB so that we don't miss permission grants on new objects. It works great for us. No more worries about permissions.

--Trigger name and event list are illustrative.
CREATE TRIGGER tr_GrantPermission ON DATABASE
FOR CREATE_PROCEDURE, CREATE_FUNCTION
AS
BEGIN
DECLARE @ObjectName AS NVARCHAR (100),
@fntype AS VARCHAR (5),
@EventType AS NVARCHAR (50),
@Grant AS NVARCHAR (300);

SET @EventType = EVENTDATA().value('(/EVENT_INSTANCE/EventType)[1]', 'nvarchar(50)');

SELECT @fntype = type FROM sys.objects
WHERE name = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(100)');

SELECT @ObjectName = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(100)');

--Default to Execute (covers procedures), then adjust for function types.
SELECT @Grant = 'Grant Execute on ' + @ObjectName + ' TO DBRole';
SELECT @Grant = CASE @fntype
WHEN 'FN' THEN 'Grant Execute on ' + @ObjectName + ' TO DBRole'
WHEN 'IF' THEN 'Grant Select on ' + @ObjectName + ' TO DBRole'
ELSE @Grant
END;
--Print @Grant
IF @Grant <> 'None'
BEGIN TRY
EXECUTE (@Grant);
END TRY
BEGIN CATCH
RAISERROR ('Error while Granting permission to the proc %s', 16, 1, @ObjectName);
END CATCH
END

If you have multiple DBs, then this needs to be deployed to every DB where you need automated permission grants. You can generate a separate script for every
DB, or you can keep one script for all DBs. For one script, you need to modify the above script by adding a variable for the role name and conditionally
setting its value based on the DB_Name() function output. Here is the updated script.

--Trigger name and event list are illustrative.
CREATE TRIGGER tr_GrantPermission ON DATABASE
FOR CREATE_PROCEDURE, CREATE_FUNCTION
AS
BEGIN
DECLARE @ObjectName AS NVARCHAR (100),
@fntype AS VARCHAR (5),
@EventType AS NVARCHAR (50),
@Grant AS NVARCHAR (300),
@RoleName AS NVARCHAR (20);

SET @EventType = EVENTDATA().value('(/EVENT_INSTANCE/EventType)[1]', 'nvarchar(50)');

SELECT @fntype = type FROM sys.objects
WHERE name = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(100)');

SELECT @ObjectName = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(100)');

IF DB_Name() = 'DB1'
SET @RoleName = 'DB1Role';
ELSE IF DB_Name() = 'DB2'
SET @RoleName = 'DB2Role';
ELSE
SET @RoleName = 'DBRole';

SELECT @Grant = 'Grant Execute on ' + @ObjectName + ' TO ' + @RoleName + ';';
SELECT @Grant = CASE @fntype
WHEN 'FN' THEN 'Grant Execute on ' + @ObjectName + ' TO ' + @RoleName + ';'
WHEN 'IF' THEN 'Grant Select on ' + @ObjectName + ' TO ' + @RoleName + ';'
ELSE @Grant
END;
PRINT @Grant;
IF @Grant <> 'None'
BEGIN TRY
EXECUTE (@Grant);
END TRY
BEGIN CATCH
RAISERROR ('Error while Granting permission to the proc %s', 16, 1, @ObjectName);
END CATCH
END

Now we can add all kinds of fancy stuff here. For example, you can send all this information, including the grant statement, to a logging table, or grant permission
to another login/role if needed. A couple of small tweaks to the above script will do that for you.

So here is my simple method to automate permission grants for some common DB routines. There are several methods to do this, so use whatever works best
for you.

Any comments, suggestions, or improvements are welcome.


Categories: Sql Server
