SQLWebcast2 Part 3 of 3 from sqlworkshops.com

This is a webcast courtesy of SQLWorkshops.com 

Note: Doubleclick video for fullscreen.

In part 3 of this webcast Ramesh Meyyappan discusses some performance tips and tricks to use while rewriting queries and helps you to optimize query executions according to CXPACKET wait time and parallel query executions.

[mediaplayer src='http://sqltechconsulting.com/video/SQLWebcast2Part3of3.wmv' autoLoad=1 ]

SQLWebcast2 Part 2 of 3 from sqlworkshops.com

This is a webcast courtesy of SQLWorkshops.com 

Note: Doubleclick video for fullscreen.

In part 2 of this webcast Ramesh Meyyappan discusses optimizing queries according to CXPACKET wait time and distribution of rowcounts per cpu core.

[mediaplayer src='http://sqltechconsulting.com/video/SQLWebcast2Part2of3.wmv' autoLoad=1 ]

SQLWebcast2 Part 1 of 3 from sqlworkshops.com

This is a webcast courtesy of SQLWorkshops.com 

Note: Doubleclick video for fullscreen.

In part 1 of this webcast Ramesh Meyyappan discusses parallel query executions and CXPACKET waits.

[mediaplayer src='http://sqltechconsulting.com/video/SQLWebcast2Part1of3.wmv' autoLoad=1 ]

SQLWebcast1 Part 3 of 3 from sqlworkshops.com

This is a webcast courtesy of SQLWorkshops.com 

Note: Doubleclick video for fullscreen.

In part 3 of this webcast Ramesh Meyyappan discusses how you can modify some queries to prevent sort spill overs to tempdb and he includes some tips you can use to do the same in certain situations.

[mediaplayer src='http://sqltechconsulting.com/video/SQLWebcast1Part3of3.wmv' autoLoad=1 ]

SQLWebcast1 Part 2 of 3 from sqlworkshops.com

This is a webcast courtesy of SQLWorkshops.com 

Note: Doubleclick video for fullscreen.

In part 2 of this webcast Ramesh Meyyappan discusses how certain best practices do not always help performance.  Sometimes it even hurts.  One best practice he discusses is the practice of adding data files to tempdb according to the amount of cpu cores you have on the server.  He explains how optimizing memory spill overs of sort operations into tempdb makes more of an improvement and in some cases adding data files to tempdb actually hurts performance.
[mediaplayer src='http://sqltechconsulting.com/video/SQLWebcast1Part2of3.wmv' autoLoad=1 ]

SQLWebcast1 Part 1 of 3 from sqlworkshops.com

This is a webcast courtesy of SQLWorkshops.com 

Note: Doubleclick video for fullscreen.

In part 1 of this webcast Ramesh Meyyappan discusses sql server query executions with sorts over tempdb vs sorts over memory and how you can modify queries for better performance.
[mediaplayer src='http://sqltechconsulting.com/video/SQLWebcast1Part1of3.wmv' autoLoad=1 ]

Running Large Serialized Transactions

I was loading  a very large amount of data into a test machine today.  I had the machine completely to myself, I had a backup before the process began and I had a profiler trace running on another machine to monitor every statement processed.  The load was all part of a serialized transaction.  The machine was running SQL Server 2005 Standard edition on a dual core server with 16GB of memory.  It’s not the best machine but for testing environment its more then competant.  So we have a pretty isolated environment and you would think everything should be pretty straight forward. 

Wouldn’t you know it?  Halfway through the 8 hour process, the machine became unresponsive on me.   I couldn’t connect via ssms, DAC, SQLCMD, nothing.  The machine was pegged at 100% CPU, the memory was at 100% as well.  The machine was basically shot and rolling down a hill.  The last statement I saw from my profiler was 20 minutes ago…just got back from lunch by the way, and oh boy….what a day this was working out to be.  And my boss just sent me an email asking how things were going.  I am going to explain how I went about solving this issue but I just want to let you know that this is not something that you can always do, but when in a bind and during certain conditions it is a pretty nifty solution and is why I thought it would make for an interesting article.  I was in a bind and I was trying to perform the action in the fastest possible time because people were waiting for the server, and they really needed to start doing their testing.  Here is what I did:

Since I was unable to connect to the SQL Server instance whatsoever, I had to do some drastic measures.  I had a backup and I was the only one on the server so I was not concerned with data loss.  So I just decided to kill the SQL Server service from the services control panel.  Wouldn’t you know it this hung on me too.   So being the cowboy I am, I just went to task manager and killed the sqlserver.exe while I was at it.  Now, knowing that this is a completely abrupt and unorthodox thing to do, I knew all the dbs were going to have to be recovered by SQL Server when the server was restarted.  I didn’t feel like waiting so I decided to do some more unorthodox things.  I renamed one of my datafiles on the database that I was working on.  Then I restarted the service.  This of course brings the db up, but in suspect mode.  From here you can just romove the suspect db and restore from a backup.  Again this is an unorthodox way, but it works in certain situations and mine was one of them. 

So now where does this leave me.  I had my database back to square one, the server was stable again, but I still have to run this monster data load that needs to be run and figure out just what happened to cause this whole mess.  All my DMV’s are also now useless as the service was restarted and do not contain any of the information I would have had if the darn server wasn’t unresponsive and I couldn’t connect to it.  I opened up the error log on windows only to find that it contains nothing to help me.  The SQL Server error log show’d nada as well.  I did have my profiler trace and I knew the last statement that completed so I tried running the statement it failed on.  No problems…so I started thinking and finally realized the issue. 

Well, I was running a serialized transaction, and because of this it has to keep the entire transaction of updates in memory until the final commit is issued.  When the machine ran out of phicical memory it started to thrash out to disk causing the server to become unresponsive and new connections were unable to connect.  So the moral of the story is to be sure that when you are running a serialized transaction that you have enough memory to hold the entire transaction.

How to install Security Policies for your SQL Servers

I just got back from another great meeting at my local SQL Server User Group. In tonight’s meeting we had a presenter named Joe Toscano.  Here’s some info about Joe the presenter:

Joe Toscano of RDA is a SQL Server Business Intelligence consultant, instructor and speaker based near Philadelphia, Pa. He has worked extensively with SQL Server since version 6.5, as well as Sybase and Informix prior to that. Joe’s areas of expertise include Data Transformation Services (DTS), Integration Services (SSIS), Transact-SQL (T-SQL) programming, data warehousing, performance tuning, replication, high availability (mirroring), security, disaster recovery, upgrading, migration projects and lights-out DBA tasks.

Joe was an absolutely wonderful presenter.  You can really see his passion and knowledge for databases by his expressions and demeanor throughout his entire presentation.  I was intrigued by his session from the beginning and he kept me extremely interested until the very end.  I thought the information I learned from him made for a great post so that’s why I am sharing that information with you all now.  So, Policy management.  What are they all about? 

Well Policy Management is a new feature in SQL Server 2008.  It allows you to define and enforce policies for configuring and managing SQL Server across the enterprise.  It requires setting up a Central Management Server (CMS) which acts as the traffic cop that monitors the other servers in your environment.  This CMS server must be SQL Server 2008, but it can monitor lower servers, like 2005 and 2000 etc.  When using the policies in 2008 and 2005, the policies actively prevent the conditions from occurring by issuing errors via ddl triggers.  With the servers pre 2005 they didn’t have this functionality yet so it is more of an after the fact implementation.  It’s still a great mechanism to keep track of those things none the less. 

So, before we can dive deeper into the subject and see some examples there are a number of terms that we need to define and understand regarding Policy-Based Management and those terms are:

  • Target - an entity that is managed by Policy-Based management; e.g. a database, a table, an index, etc.
  • Facet – a predefined set of properties that can be managed
  • Condition – a property expression that evaluates to True or False; i.e. the state of a Facet
  • Policy – a condition to be checked and/or enforced

I found this great step by step walkthrough with screenshots from Ray Barley on how to set up Policy-Based Management

How to Audit SSAS role permissions

A while back I was very disappointed to discover that SSAS does not have any easy way to go through all of your dimensions and cubes to see which users have access to which dimensions, cubes, etc.  Sure you can go one by one in order to get the information, but I was really just looking for an easy way to view the permissions based on individual object permissions, similar to the way you can in SSMS.  I called Microsoft and unbelievably they had no solution.  What a pinch to be in when you have an important audit coming and now it is going to take an absorbant amount of time to come up with the data the auditors are going to need.

Well, fortunately for me, I ran into an incredible site today.  It’s called ASSP – Analysis Services Stored Procedure Project This site is great!  It’s an open source project and it is just what the doctor ordered.  You can download the software and open up the project in Visual Studio.  From there you just build the ASSP assembly and you are all set.  This assembly has a number of various functions built into it and it returns a number of helpful datasets in the datasets you will be looking for.  They have also have additional reports and things pre-made and built right in to the project for you as well, so you can deploy those right to SSRS.  Here is an exerpt of one of the funtion calls and documentation:

DiscoverXmlMetaData(path, restrictions)
This function executes a DISCOVER_XML_METADATA command, which returns a hierarchical resultset and then “flattens” that resultset and returns it as a datatable which can be displayed by SSMS or used in a Reporting Services report.

Unfortunately standard Xpath style queries did not prove rich enough to return data in it’s full context. That is you could return a list of all the dimensions in every cube on a given server, but you could not return a column containing the cube and database to which they belonged. So a hybrid Xpath syntax was developed which allows you to specify a list of fields to return at each level.

The easiest way to explain this is with a few examples

In order to return return a list of all the dimensions in every cube in the current database you would issue the following command.

CALL ASSP.discoverXmlMetaData("DatabaseCubesCubeDimensionsDimension");

But this returns the list without any context, If you want to report the CubeName and LastProcessed date you would put a vertical bar (|) after the Cube element in the path and then put a comma separated list of fields that you want returned

CALL ASSP.discoverXmlMetaData("DatabaseCubesCube|Name,LastProcessedDimensionsDimension");

Note: it is not entirely true that the DiscoverXmlMetadata() command returns data without context – one of the features of this function is that it automatically returns the “ID” field of any ancestor elements in the resultset.

If you want to see what sort of paths are available the easiest way is to run the following command in an XMLA query window. You basically need to skip the xsd:schema> element and look at what is returned under the <row><xars:METADATA> elements.

<Discover xmlns="urn:schemas-microsoft-com:xml-analysis">
  <RequestType>DISCOVER_XML_METADATA</RequestType>
  <Restrictions />
  <Properties />
</Discover>

The path parameter actually matches from the leaf level up, so you could do a command like

CALL ASSP.discoverXmlMetaData("Dimension");

But this would produce a bit of a mess as it would return both cube dimensions and database dimensions and other dimension references from all over the xml resultset. The function is built in such a way that it build the columns for the resultset based off the first populated node in that matches the path so you will end up with a lot of blank columns.

But what this does mean is that you can specify just the minimal path that will unquiely identify the elements you are interested in.

eg

CALL ASSP.discoverXmlMetaData("CubeDimensionsDimension");

returns the same result as

CALL ASSP.discoverXmlMetaData("DatabaseCubesCubeDimensionsDimension");

It should be noted that because this is base on Xml, the path is case sensitive and “dimension” is not the same as “Dimension”.

Role information is not exposed by any of the other schema rowsets. The following commands will extract information about roles, dimension permissions, attribute permissions and cube permissions and would allow for the complete documentation of the security settings.

CALL ASSP.DiscoverXmlMetadata("DatabaseRolesRole");

CALL ASSP.DiscoverXmlMetadata("DatabaseDimensionsDimensionDimensionPermissionsDimensionPermission");

CALL ASSP.DiscoverXmlMetadata("DatabaseDimensionsDimensionDimensionPermissionsDimensionPermissionAttributePermissionsAttributePermission")

CALL ASSP.DiscoverXmlMetadata("DatabaseCubesCubeCubePermissionsCubePermission");

 

I am so thrilled!  This is just what I was looking for.  Check it out and see for yourselves just how great this assembly is.

#TSQL2sDay Article Submission – Join syntax

I heard about Adam Machanic’s #TSQL2sday initiative and I thought it was a fantastic idea.  For those of you not familiar with this initiative read up about TSQL2sday  Here’s a brief synopsis for this month’s topic.

Theme

Have you ever found yourself unable to figure out the intricacies of how some piece of code works? Ever been confused by the results you’ve gotten back from a query, only to find out that something totally unrelated was going on? Or have you ever been compelled to wile away your spare time working on a “challenge” posted by some blogger?

For this month’s T-SQL Tuesday, I’m asking participants to write a blog post on a “puzzling” topic, along the lines of some of the following ideas:

  • Describe a confusing situation you encountered, and explain how you debugged the problem and what the resolution was
  • Show a piece of code that doesn’t behave as most people might expect, and illustrate the reasoning behind the discrepancy
  • Create a challenge for your readers to solve

As always, even given the event’s name the posts are not limited to T-SQL! Any component of, or software product related to SQL Server, is fair game. MDX, SSIS, LINQ to SQL, Entity Data Model, NHibernate, and any other software product that deals with SQL Server data can be featured in your post. Be creative!

Rules

As before, any blogger can–and should–feel free to participate in this event. In order to make the event slightly more international, I’m changing the time range from PST-based to UTC-based. So the rules are:

  • Your post must go live between 00:00:00 UTC and 23:59:00 UTC on Tuesday, January 12, 2010
  • Your post must link back to this one, and it’s recommended that you clearly identify the post as a T-SQL Tuesday post
  • You are responsible for ensuring that a trackback or comment appears here so that I can find the posts

Follow the rules, and your post will be included in the roundup to be posted on January 13 or 14. Don’t follow the rules, and it won’t show up there. Simple as that!

My Submission

Not every SQL Server upgrade is equal.  Some are very straight forward and simple.  Yet others are just a complete bear, and have many things that have to be very carefully thought out and tested many times in order to insure things go smoothly.   The upgrade from SQL Server 2000 to SQL Server 2005 or 2008 is no exception and often times it will yield a few obstacles in your path that may slow you down some and at times it may get quite confusing in places.   To start out we may have to convert a lot of our DTS packages to SSIS packages.  This is often a chore and it is confusing in itself sometimes, but that’s not what I wanted to write about.  What I wanted to write about today are the obstacles that I often face during the conversion of the old TSQL join syntax in queries to the SQL-92 standard syntax.   With the old TSQL standard (pre-SQL Server 2005), left and right outer join conditions were able to be specified in the WHERE clause using the *= and =* operators.   In some cases,  this results in an ambiguous query that can be interpreted in more than one way. SQL-92 compliant outer joins on the other hand are specified in the FROM clause and therefore do not result in this same ambiguity.   In order to Upgrade to SQL Server 2005 or SQL Server 2008 we are forced to convert to these types of joins or we have to run our databases in a lower capability mode.  We can run the lower capability in some environments to ease the transition for a little while but eventually we will have to convert them all to the new standard.  So let’s now look at an example of how we can convert some queries to the SQL-92 standard from the old TSQL syntax.  Review the following lines of code:

select SubCategory=psc.Name, ProductName=p.Name, p.ListPrice, p.Color

from Production.Product p, Production.ProductSubcategory psc

where p.ProductSubcategoryID *= psc.ProductSubcategoryID

and Color is not null

--(256 row(s) affected)

 This query can be really simple to convert.  We just to remove the *= and move up the two fields referenced into the ON clause of the LEFT OUTER JOIN statement.  The result sets match and everything seems to be fine.

SELECT psc.Name AS SubCategory, p.Name AS ProductName, p.ListPrice, p.Color

FROM Production.Product AS p LEFT OUTER JOIN

Production.ProductSubcategory AS psc ON p.ProductSubcategoryID = psc.ProductSubcategoryID

Where (p.Color IS NOT NULL)

--(256 row(s) affected)

That wasn’t so bad, right?  Let’s now look at a little more complex example.   Concider the next few lines of code:

SELECT DISTINCT branch.branch_no,branch.branch_name,site.site_no
FROM site,
customer_site,
branch
WHERE site.site_no *= customer_site.site_no
and getdate() between customer_site.start_date and customer_site.end_date
and branch.branch_no = site.branch_no
and ( site.co_no = 1 )
ORDER BY branch.branch_no ASC, site.site_no ASC

If we did the same thing as before, everything would look to be just fine but when we actually run the query we will discover that the result sets would return different counts of records.  This is the ambiguity I was refering to earlier.  So what seems to be the problem?

SELECT DISTINCT branch.branch_no,branch.branch_name,site.site_no
FROM site
    left join customer_site
        on (site.site_no = customer_site.site_no
            and site.co_no = 1)
    inner join branch
        on (branch.branch_no = site.branch_no )
WHERE
    getdate() between customer_site.start_date and customer_site.end_date
ORDER BY
    branch.branch_no ASC,
    site.site_no ASC

The discrepancy in this example comes down to how the relations are really joined.  You see, there are two main things we always have to be mindful of when we are converting queries to the new form and they are. 

1) We need to first determine what the correct elements of the join itself are. 

2) We need to further identify the supplemental conditions have been applied to limit or filter the data returned after the initial join has taken place and decide if they are really part of the join or are applied afterwards.

What I mean by these two things is that there are certain criteria that is used to join the two relations and those are the things that need to be moved to the ON phrase of your join.  After the relations are joined there are often some additional filters applied to the result set as well that can widdle down the result set further but these may need to stay in the WHERE clause or be added to the join statement as well.  This is the confusing part.  So, for this conversion above the following would be a workable solution. 

SELECT DISTINCT branch.branch_no,branch.branch_name,site.site_no
FROM site
    left join customer_site
        on (site.site_no = customer_site.site_no
            and site.co_no = 1
            and getdate() between customer_site.start_date and customer_site.end_date)
    inner join branch
        on (branch.branch_no = site.branch_no )
ORDER BY
    branch.branch_no ASC,
    site.site_no ASC

As you can see, the SQL-92 standard really make a lot more sense when looking at the queries because if you are joining on something you do exactly that, we JOIN on it!  If you are doing additional filtering we keep that stuff in the where clause.  The real problem though is that depending on how the relations are joined in the first place you can get different result sets as in our second example above.   For inner joins this is not so much a problem but for outer joins it really comes into play.  In our example above the additional filters are actually needed in the join condition itself. Have you ever had any issues converting some of your joins to the new syntax like these?  Please let me know what they were and how you ended up solving them. 

TIP

One little trick that I have used in the past to speed up some of my conversions is to write the queries as a view in the view designer first.  By opening the view designer in SSMS it will automatically convert the queries for you.  It is not always 100% correct as the filters may need to be rearranged but it does get you pretty darn close and it makes it a bit easier when dealing with complex statements.  Try it out and let me know what you think.