Archive

Archive for the 'Sql Server' Category

Using R to Load CSV and Excel Files in SQL Server 2016

November 14, 2015

It's been more than two and a half years since I last blogged. Lots of changes personally and professionally 🙂. I finally had some time to learn and test some of the new features coming in SQL Server 2016.

As soon as I heard about R integration as a new feature in SQL Server 2016, I couldn't resist trying it out on one of the frequent and important tasks I do when I have to analyze data: loading .csv or .xlsx file data into SQL Server. Loading .csv and .xlsx files was always possible through T-SQL (BULK INSERT), SSIS, or Import/Export, but there were always some hoops to jump through when loading .xlsx data. With the introduction of R in SQL Server, I think we have one more easy way to load these types of files. Let's see how it works.

"This assumes you have already installed SQL Server 2016 CTP 3 and run all post-installation configurations for R."

I will load the baseball.csv file into SQL Server first. With R in SQL Server 2016 we can execute R scripts directly from SSMS using the stored procedure sp_execute_external_script like this:

execute sp_execute_external_script
@language = N'R'
, @script = N' setwd("D:/csvdirectory");
mytextvariable <- read.csv("baseball.csv", header = TRUE);
OutputDataSet <- as.data.frame(mytextvariable);'
, @input_data_1 = N' '
WITH RESULT SETS (([Team] char(5) NOT NULL, [League] char(3) NOT NULL, [Year] int NOT NULL, Win int NOT NULL, Playoffs bit NOT NULL));

I am using three parameters here:

@language specifies the script language, "R".

@script is the parameter where all the R code goes. This is where I included my basic R script. It first changes the working directory to the directory where the baseball.csv file exists. read.csv() is the R function that reads a CSV file. Finally, I convert the CSV data to a data frame and assign it to the OutputDataSet variable. Since we generate values using just the R script, we leave the input query string in @input_data_1 blank; it is a required parameter, so we could also supply any valid SQL statement, which would not be executed by R anyway.
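As a side note, the parse-and-type step that the R script performs can be sketched in Python. This is only an illustration outside SQL Server: the sample rows and the column types below are hypothetical, chosen to mirror the RESULT SETS column definitions used in the T-SQL above.

```python
import csv
import io

# Hypothetical stand-in for baseball.csv (same columns as the RESULT SETS clause).
sample = "Team,League,Year,Win,Playoffs\nBOS,AL,2012,69,0\nSTL,NL,2012,88,1\n"

def read_baseball(text):
    """Parse CSV text into typed rows: (char, char, int, int, bit-like bool)."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append((rec["Team"], rec["League"], int(rec["Year"]),
                     int(rec["Win"]), bool(int(rec["Playoffs"]))))
    return rows

rows = read_baseball(sample)
print(rows)
```

The typing step matters for the same reason WITH RESULT SETS does: the raw file is all text, and something has to declare what each column really is.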

WITH RESULT SETS is where I define the column names. I simply used the same column names as in the .csv file. When this script executes, the values defined by the WITH RESULT SETS clause are returned in the Results pane in SSMS. Here is the screenshot.

[Screenshot: rscript_output, the R script results in the SSMS Results pane]

Now what if I want to load this data into a table for further analysis, or to join with other tables? I couldn't find any published way to do this, so I thought I'd try it myself. The first thing that came to mind was the INSERT...EXEC method for loading stored procedure results into a table. After a couple of dynamic SQL and single-quote related issues, it finally worked 🙂

Here is how it works:

I created a temp table to store the data. This could be a permanent table too.

create table #t ([Team] char(5) NOT NULL, [League] char(3) NOT NULL, [Year] int NOT NULL, Win int NOT NULL, Playoffs bit NOT NULL);

After that I insert into the temp table, executing the R script as dynamic SQL. It's the same script as above, but executed as dynamic SQL like this:

Insert into #t
Execute
('
execute sp_execute_external_script
@language = N''R''
, @script = N'' setwd("D:/csvdirectory");
mytextvariable <- read.csv("baseball.csv", header = TRUE);
OutputDataSet <- as.data.frame(mytextvariable);''
, @input_data_1 = N'' ;''
WITH RESULT SETS (([Team] char(5) NOT NULL, [League] char(3) NOT NULL, [Year] int NOT NULL, Win int NOT NULL, Playoffs bit NOT NULL));'
);
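The single-quote issues I ran into come from T-SQL's escaping rule: inside a string literal, every embedded single quote must be doubled, so N'R' becomes N''R'' once the whole call is wrapped in EXECUTE('...'). A tiny sketch of that transformation (illustrative only, not part of the actual load):

```python
def embed_in_dynamic_sql(tsql: str) -> str:
    """Wrap a T-SQL fragment for EXECUTE('...') by doubling embedded single quotes."""
    return "Execute\n('" + tsql.replace("'", "''") + "'\n);"

inner = "execute sp_execute_external_script @language = N'R'"
wrapped = embed_in_dynamic_sql(inner)
print(wrapped)
```

Nesting another level of dynamic SQL would double the quotes again, which is why these scripts get hard to read quickly.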

select * from #t;

Voila! The data is loaded into the table. Now we can do all kinds of analysis, or use it however we need to.

I am going to use this a lot to automate .csv and .xlsx file uploads to tables. I didn't check with huge .csv or .xlsx files, but I think some performance penalty is acceptable for my scenarios, given the other benefits of this approach.

Loading an .xlsx file requires loading the xlsx library in the R script, otherwise it won't work. I had to install the xlsx package into my R environment first; after that it is the same script as above, with one extra line of code to load the xlsx library.

Here is the script to load Excel files:

Insert into #t
Execute
('
execute sp_execute_external_script
@language = N''R''
, @script = N'' setwd("R:/DBA");
library(xlsx);
mytextvariable <- read.xlsx("baseball.xlsx", sheetName="Sheet1");
OutputDataSet <- as.data.frame(mytextvariable);''
, @input_data_1 = N'' ;''
WITH RESULT SETS (([Team] char(5) NOT NULL, [League] char(3) NOT NULL, [Year] int NOT NULL, Win int NOT NULL, Playoffs bit NOT NULL));'
);

That's it. I am really excited to see a lot of cool features in SQL Server 2016. Some of these features are definitely going to help me in my day-to-day work. A couple of features I am looking forward to learning and working with are R in SQL Server and loading/querying JSON-format data.

I will continue to test and learn more R-related SQL Server features and blog about them here.

Thanks,

NJ

 

 

Categories: Sql Server

Load Excel Spreadsheet to Table: A Tool That I Built

October 15, 2012

Very often we are asked to load data from Excel spreadsheets into a DB for data analysis, a data fix, etc. There are several out-of-the-box ways to load data into a DB, like SSIS, BULK INSERT, and now PowerShell. All these methods work fine and do the job, but they carry quite a bit of learning overhead for non-DB folks. The other problem is permissions: in almost every organization, developers do not have write access beyond the dev environment. So it falls to the DBAs to do the job, which becomes a pain as requests keep pouring in from developers 🙂. To avoid these situations and help developers load data from Excel into a DB table, I built a small tool in C#.

We allow developers to load data from Excel into tables in one specific DB, which can later be used to analyze data by joining with the primary DB. We built a process to clean up these temp tables after 15 days.
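The 15-day cleanup can be sketched as a small routine that picks the tables past the cutoff and emits DROP statements. This is only an illustration of the idea, not our actual process; the table names and dates below are hypothetical.

```python
from datetime import date, timedelta

def stale_table_drops(tables, today, max_age_days=15):
    """Given (name, created_date) pairs, return DROP TABLE statements
    for every table older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [f"DROP TABLE [{name}];" for name, created in tables if created < cutoff]

# Hypothetical staging tables with their creation dates.
tables = [("upload_orders", date(2012, 9, 1)), ("upload_fresh", date(2012, 10, 14))]
print(stale_table_drops(tables, today=date(2012, 10, 15)))
```

In practice the creation dates would come from the DB's catalog metadata rather than a hard-coded list.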

Here is screenshot :

[Screenshot: the tool's main form]

All the fields in the above form are self-explanatory. I am using two different providers here: "Microsoft.Jet.OLEDB.4.0" and "Microsoft.ACE.OLEDB.12.0". If you don't have the latest version of MS Office (2010), you may not have the latest driver. You need to download and install AccessDatabaseEngine_x64.exe from the location below.

http://www.microsoft.com/en-us/download/details.aspx?id=13255

If you don't want to download and install the latest driver, then you need to select the "Live in Cave" option (no offense 🙂). This makes the tool use the Jet OLE DB provider to load the Excel file. But there is one caveat with this option: you can't load Excel files with the .xlsx extension. Files with the .xlsx extension are stored as XML, which Jet OLE DB won't recognize. The workaround is to open the .xlsx file and save it as .xls; you should then be able to use the tool to load the .xls file.

If you don't select the "Existing Table" option, the tool will create a new table in the DB with all columns as varchar(20). If the first row in the Excel sheet doesn't contain column names, arbitrary column names (F1, F2, etc.) are used for the new table.
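That table-creation behavior can be sketched like this: take the header row if one is present, otherwise fall back to F1, F2, ..., and emit a CREATE TABLE with every column typed varchar(20). The sketch below is illustrative only; the table and column names are hypothetical, and the real tool is C# code, not Python.

```python
def build_create_table(table_name, first_row, has_header):
    """Mimic the tool's behavior: header names if present, else F1, F2, ...,
    with every column typed varchar(20)."""
    names = first_row if has_header else [f"F{i + 1}" for i in range(len(first_row))]
    cols = ", ".join(f"[{n}] varchar(20)" for n in names)
    return f"CREATE TABLE [{table_name}] ({cols});"

print(build_create_table("StagingSheet1", ["Team", "Year"], has_header=True))
print(build_create_table("StagingSheet1", ["BOS", "2012"], has_header=False))
```

Typing everything varchar(20) keeps the tool simple; any real type conversion is left for a later query against the staging table.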

Since we have DBs in multiple domains with no trust between them, we allow only SQL authentication to load data. I disabled Windows Authentication in the tool for now; I plan to enable it in the next version for general use.

This tool has really helped DBAs and developers quickly get data into the DB for a data fix or analysis, so give it a shot and see if it helps you the way it did us.

Disclaimer: This tool has been tested and used successfully on several machines, but there is still no guarantee that it will work on all machines, since a lot of environment-related factors play a role during execution. Though I have not seen any security-related issues while using it, use against a production DB is not recommended.

The app can be downloaded from here.

Happy Loading

-Neeraj

Categories: Sql Server, Tool

Script Table Indexes in SSMS Using PowerShell

July 17, 2012

An important part of a DBA's job is to regularly tune DB queries. When you are tuning queries, you want to know the indexes created on a table, with their column names, etc. This tip is all about getting that information from the DB quickly.

There are several ways to get this information. You can expand the table in SSMS and script the indexes manually, which becomes tedious when you have to locate one table among thousands 🙂 as in my case. Or you can create your own SP querying the system tables and set it up as a shortcut key, which works great. But as I have been playing with PowerShell lately, I tried to do the same using PowerShell. There is one small caveat: to run a PowerShell script from SQL Server you have to use the xp_cmdshell command. xp_cmdshell is disabled by default (it is an advanced option), so you may need to enable it first by running the commands below:

Exec sp_configure 'show advanced options', 1;

go

Reconfigure with override

go

Exec sp_configure 'xp_cmdshell', 1;

go

Reconfigure with override

go

I created the PowerShell script below and saved it on the server as script_indexes.ps1.

param([string]$Server, [string]$database, [string]$table)
[reflection.assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo") | out-null
$srv = new-object Microsoft.SqlServer.Management.Smo.Server($Server)
$db = $srv.databases[$database]
$tb = $db.tables | where-object {$_.Name -like $table}
foreach ($index in $tb.Indexes)
{
    $index.script() + "GO `r`n "
}

Finally, create an SP in the DB that calls this PowerShell script and outputs the index script in the SSMS results window. For me this has been a really fast way to see the columns in an index, including included columns. It also makes it easy to modify indexes, since you already have the CREATE INDEX script ready. Here is the SP code.

/***********************************************************
Sample Call -- Exec nsp_script_indexes 'TableName'
************************************************************/
CREATE PROCEDURE nsp_Script_Indexes
@TbName sysname
AS
BEGIN
SET NOCOUNT ON;
DECLARE @cmd AS NVARCHAR (500);
DECLARE @Servername AS sysname, @DbName AS sysname;

-- Temp table to store powershell command output.
CREATE TABLE #output
(
id           INT           IDENTITY,
Index_Script VARCHAR (MAX)
);

SET @Servername = @@SERVERNAME;
SET @DbName = DB_NAME();

--Change path for powershell file location for your server.
SET @cmd = 'PowerShell.exe "G:\PowerShell\script_indexes.ps1 ' + '"' + @Servername + '"' + ' "' + @DbName + '"' + ' "' + @TbName + '"' + '"';
--Print @cmd

INSERT INTO #output (Index_Script)
EXECUTE xp_cmdshell @cmd;

--Delete unnecessary data from temp table.
DELETE #output
WHERE Index_Script IS NULL
OR Index_Script = '';

--Use 'GO' word in script to find end of each create index.
SELECT row_number() OVER (ORDER BY ID) AS rn,
ID,
Index_Script
INTO #opt
FROM #output
WHERE ltrim(rtrim(Index_Script)) = 'GO';

DECLARE @i AS INT, @c AS INT, @ind AS NVARCHAR (1000), @d AS INT, @n AS INT;
SET @d = 0;

--Finally concatenate index script back and print.
SELECT @i = MIN(rn),
@c = MAX(rn)
FROM #opt;

WHILE (@i <= @c)
BEGIN
SET @n = @i;
SELECT @n = ID
FROM #opt
WHERE rn = @i;

SELECT @ind = (SELECT CONVERT (VARCHAR (1000), Index_Script + '')
FROM #output
WHERE id < @n
AND id > @d
FOR XML PATH (''));

SET @d = @n;
SET @i = @i + 1;
PRINT @ind + CHAR(10) + 'GO';
END
END
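The concatenation loop in the SP does, in T-SQL, essentially what this Python sketch does: xp_cmdshell returns the generated script one line per row, and the lines containing only 'GO' mark where one CREATE INDEX statement ends and the next begins. The sample lines below are illustrative.

```python
def regroup_statements(output_lines):
    """Group line-per-row command output back into whole statements,
    using lines that contain only 'GO' as delimiters (as the SP does)."""
    statements, current = [], []
    for line in output_lines:
        if line.strip() == "GO":
            statements.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return statements

lines = ["CREATE INDEX IX_a ON t(a)", "GO", "CREATE INDEX IX_b ON t(b)", "GO"]
print(regroup_statements(lines))
```

The SP has to do this with row numbers and FOR XML PATH because INSERT...EXEC gives it one row per output line, not one row per statement.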

Now you can add this SP call as a shortcut key in SSMS, and you are ready to quickly get index info while troubleshooting queries.

Sample Call -- Exec nsp_script_indexes 'TableName'

Happy tuning..

-NJ

Categories: Sql Server

DDL Trigger to Automate Permissions

June 8, 2012

This post is about object permissions in the DB and how we automated them.

In our DB environment we use role-based security for every DB. Whenever an SP or function is deployed to a DB, we grant the appropriate permission
to a role in that DB. Direct access to tables/views is not granted to application logins. We needed to come up with a solution that grants the appropriate
permission whenever a new SP/function is created in the DB. So DDL triggers came to the rescue.

Here is the small script I deployed in my DB so that we don't miss a permission grant on new objects. It works great for us. No more worries about permissions.

CREATE TRIGGER Trg_Module_Perm_DDL
ON DATABASE
FOR CREATE_FUNCTION, CREATE_PROCEDURE
AS BEGIN
SET NOCOUNT ON;
DECLARE @ObjectName AS NVARCHAR (100),
@fntype AS VARCHAR (5),
@EventType AS NVARCHAR (50),
@Grant AS NVARCHAR (300);

SET @EventType = EVENTDATA().value('(/EVENT_INSTANCE/EventType)[1]', 'nvarchar(50)');

SELECT @fntype = type FROM sys.objects
WHERE name = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(100)');

SELECT @ObjectName = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(100)');

IF @EventType = 'CREATE_PROCEDURE'
BEGIN
SELECT @Grant = 'Grant Execute on ' + @ObjectName + ' TO DBRole';
END
IF @EventType = 'CREATE_FUNCTION'
BEGIN
SELECT @Grant = CASE @fntype
WHEN 'FN' THEN 'Grant Execute on ' + @ObjectName + ' TO DBRole'
WHEN 'IF' THEN 'Grant Select on ' + @ObjectName + ' TO DBRole' ELSE 'None'
END;
END
BEGIN TRY
--Print @Grant
IF @Grant <> 'None'
BEGIN
EXECUTE (@Grant);
END
END TRY
BEGIN CATCH
RAISERROR ('Error while Granting permission to the proc %s', 16, 1, @ObjectName);
ROLLBACK;
END CATCH
END
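The trigger's grant-selection logic (EXECUTE for procedures and scalar functions, SELECT for inline table-valued functions) can be sketched outside T-SQL like this. The object and role names below are illustrative, and the real logic of course lives in the trigger above, not in Python.

```python
def build_grant(event_type, fn_type, object_name, role="DBRole"):
    """Mirror the trigger's CASE: procedures and scalar functions ('FN')
    get EXECUTE, inline table-valued functions ('IF') get SELECT."""
    if event_type == "CREATE_PROCEDURE":
        return f"Grant Execute on {object_name} TO {role}"
    if event_type == "CREATE_FUNCTION":
        if fn_type == "FN":
            return f"Grant Execute on {object_name} TO {role}"
        if fn_type == "IF":
            return f"Grant Select on {object_name} TO {role}"
    return "None"

print(build_grant("CREATE_FUNCTION", "IF", "dbo.fn_GetOrders"))
```

The "None" fallback matches the trigger's behavior of skipping the grant for anything it doesn't recognize, such as multi-statement table-valued functions.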

If you have multiple DBs, this needs to be deployed to every DB where you want automated permission grants. You can generate a separate script for every
DB, or keep one script for all DBs. For a single script, you need to modify the script above by adding a variable for the role name and conditionally
setting its value based on the DB_NAME() function output. Here is the updated script.

CREATE TRIGGER Trg_Module_Perm_DDL
ON DATABASE
FOR CREATE_FUNCTION, CREATE_PROCEDURE
AS BEGIN
SET NOCOUNT ON;
DECLARE @ObjectName AS NVARCHAR (100),
@fntype AS VARCHAR (5),
@EventType AS NVARCHAR (50),
@Grant AS NVARCHAR (300),
@RoleName AS NVARCHAR (20);

SET @EventType = EVENTDATA().value('(/EVENT_INSTANCE/EventType)[1]', 'nvarchar(50)');

SELECT @fntype = type FROM sys.objects
WHERE name = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(100)');

SELECT @ObjectName = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(100)');

IF DB_NAME() = 'DB1'
SET @RoleName = 'DB1Role';
ELSE
IF DB_NAME() = 'DB2'
SET @RoleName = 'DB2Role';
ELSE
SET @RoleName = 'DBRole';
IF @EventType = 'CREATE_PROCEDURE'
BEGIN
SELECT @Grant = 'Grant Execute on ' + @ObjectName + ' TO ' + @RoleName + ';';
END
IF @EventType = 'CREATE_FUNCTION'
BEGIN
SELECT @Grant = CASE @fntype
WHEN 'FN' THEN 'Grant Execute on ' + @ObjectName + ' TO ' + @RoleName + ';'
WHEN 'IF' THEN 'Grant Select on ' + @ObjectName + ' TO ' + @RoleName + ';' ELSE 'None'
END;
END
BEGIN TRY
PRINT @Grant;
IF @Grant <> 'None'
BEGIN
EXECUTE (@Grant);
END
END TRY
BEGIN CATCH
RAISERROR ('Error while Granting permission to the proc %s', 16, 1, @ObjectName);
ROLLBACK;
END CATCH
END

Now we can add all kinds of fancy stuff here. For example, you can send all this information, including the grant statement, to a logging table, or grant the permission
to another login/role if needed. A couple of small tweaks to the above script will do that for you.

So here is my simple method to automate permission grants for some common DB routines. There are several ways to do this, so use whatever works best
for you.

Any comments, suggestions, or improvements are welcome.

–Neeraj

Categories: Sql Server