Do Not Pass GO!

What is the GO statement and why is it so important to use? When do I have to use it? When do I not use it? These are questions that have passed through my head from time to time while writing T-SQL within SQL Server.

First, What Is It and When Should I Use It?

The GO statement lets SSMS (the client tool) know it has reached the end of a batch; it isn’t actually a T-SQL statement at all, but a batch separator recognized by tools like SSMS and sqlcmd. It basically defines the scope of what you are sending to the Database Engine. The example below sends a batch of two statements: the first changes the database context for what follows, and then the SELECT executes against the database Demo. Simple, yes.

Example
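Something like this, assuming a database named Demo and using sys.tables as a stand-in for whatever query you want to run:

```sql
USE Demo;
SELECT name FROM sys.tables;
GO
```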

Gotchas

I’ve been caught out by this behavior in the past. Using GO in stored procedures can be tricky. There are times when you want to run a batch of statements together, but if you put a GO into the procedure and compile it, you will notice that you lose any code that came after the GO. The GO signaled that my ALTER or CREATE PROCEDURE statement was done, so everything below it was excluded from the stored procedure.
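A minimal sketch of the problem, using a hypothetical procedure name:

```sql
CREATE PROCEDURE dbo.DemoProc
AS
    SELECT 1 AS InsideTheProc;
GO  -- the procedure definition ends here

-- This was meant to be part of the procedure, but it compiles
-- as a separate batch and is NOT included in dbo.DemoProc.
SELECT 2 AS LostStatement;
```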

Another gotcha, which can be both good and bad depending on your need: a variable’s lifespan ends at each GO statement. If you declare a variable, run a statement to populate it, and use it, you can no longer reference it once you send a GO, because each batch has its own scope.

Example
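A minimal sketch:

```sql
DECLARE @MyCount INT;
SELECT @MyCount = COUNT(*) FROM sys.databases;
PRINT @MyCount;   -- works: same batch
GO
PRINT @MyCount;   -- fails: Must declare the scalar variable "@MyCount"
```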

Cool things to do with GO

I learned this by chance, just messing around. Did you know that if you put a number after GO, it will run that batch that many times? This can be handy for generating a lot of load against a database for demos.
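For example (dbo.LoadTest is a hypothetical scratch table):

```sql
CREATE TABLE dbo.LoadTest
(
    ID          INT IDENTITY PRIMARY KEY,
    CreatedDate DATETIME2 NOT NULL DEFAULT SYSDATETIME()
);
GO
INSERT INTO dbo.LoadTest DEFAULT VALUES;
GO 50   -- runs the INSERT batch 50 times
```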

Don’t like the word GO? Change it. Yep, you can change it to anything you want under Tools > Options > Query Execution (the Batch separator setting).

Change it to RUNNOW.

Let’s Try
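With the separator changed, a batch now ends with the new word instead of GO (a sketch):

```sql
SELECT @@SERVERNAME;
RUNNOW
```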

Hmmm, why didn’t that work? Because I ran it in an existing open window (session), and the new batch separator only applies to query windows opened after the change. Let’s try that again in a new window.

TADA! Much better.

Now that you know what it does, feel free to advance to GO and collect your $200. Enjoy.

What Are These Backup Settings All About?

I ran across a client the other day that had the Backup and Recovery options set like the picture below, because that is how they default. The Database Administrator didn’t know what to configure them as, so he left them alone. I find this is the case with a lot of options. For the most part, leaving the defaults can be just fine, but leaving others alone can leave you missing out on some good features.

Let’s start from top to bottom.

Default backup media retention in days. Now, the first thing that comes to my mind is “hey, this is a cleanup job” SCORE! Thinking that maybe this will auto-delete old backups. After all, isn’t that what retention means? NOPE, not in this case.

In this case, it’s just the number of days before backup media can be OVERWRITTEN. If the DBA tries to overwrite the media before those days have passed, a warning message is raised. You’ll note that in every backup action you do, the RETAINDAYS option is filled in; it will always default to 90 now that we have changed it. In general, this is a pointless option to me. I don’t normally overwrite backup media. To me this was more relevant when tapes were used and disks were harder to come by, so I leave it alone.

TSQL
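A sketch of both pieces: the instance-wide default and the per-backup RETAINDAYS option (the database name and path are illustrative):

```sql
-- Instance-wide default (an advanced option)
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'media retention', 90;
RECONFIGURE;

-- Per-backup equivalent
BACKUP DATABASE [Demo]
TO DISK = N'D:\Backups\Demo.bak'
WITH RETAINDAYS = 90;
```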

 

GUI

Compress backup. This one is exactly what you think it is, no guessing here. Backup compression is one I highly recommend changing from the default. Compression is a HUGE topic I will save for another time. But in short: the smaller the files, the less space they take up; less data written means less IO (and less data sent to your backup device), and therefore your databases back up and restore faster. Here is a great MSDN link to learn more about the benefits of backup compression. Backup compression has been included in Standard edition and higher since SQL Server 2008 R2, so use it!
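Turning it on instance-wide takes one sp_configure call:

```sql
EXEC sys.sp_configure N'backup compression default', 1;
RECONFIGURE;
-- Or per backup: BACKUP DATABASE ... WITH COMPRESSION;
```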

Recovery Interval (in minutes). Now this one I always thought meant Recovery Point Objective, in other words how much data am I willing to lose in minutes. I am partially right. According to MSDN, this option defines an upper limit on the time recovering a database should take. The SQL Server Database Engine uses the value specified for this option to determine approximately how often to issue automatic checkpoints on a given database.

This is an option I don’t change. I have yet to see a scenario where I would want to override when SQL Server issues a checkpoint on a database by default. There are times when I want to force a checkpoint, but that’s not something I am going to set a standard for. The only reason I have heard for changing it was to reduce IO on a data drive, but to me that comes at too high a cost.

TSQL
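For completeness, here is how you would change it if you had to; I leave it at the default of 0 (auto-tuned by SQL Server):

```sql
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'recovery interval (min)', 5;  -- example value
RECONFIGURE;
```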

So, there you have it: three more options that may no longer be a mystery.

Quick Model Database Tidbit

Are you using your Model Database to its full potential?

I am finding more and more that database admins are not using the model database to its fullest potential, and some not at all.

What is that Model Database for?

The model database is basically the default setup (template) for all other databases created on a SQL Server instance. All databases created after install will inherit the properties of this database.

Why Configure It?

Using the model database can ensure consistency within your environment and is a quick way to automate your database setups. Below is a list of things I’ve used in my environments and others.

Top Settings I Have Implemented Through Model (in No Particular Order)

  • Default Growth Settings
  • Query Store Settings
  • Recovery Models
  • Read Committed Snapshot Isolation
  • Allow Snapshot Isolation
  • Auto Update Statistics Asynchronously
  • Compatibility Levels

Now, there are some things that databases will NOT inherit from the model; some of these I learned the hard way.

  • File Groups
  • CDC (Change Data Capture)
  • Collations
  • Database Owner
  • Encryption

Scripts to turn these options on
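A minimal sketch of those scripts. The growth sizes and compatibility level are example values, and Query Store requires SQL Server 2016 or later:

```sql
ALTER DATABASE [model] SET RECOVERY SIMPLE;
ALTER DATABASE [model] SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE [model] SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE [model] SET AUTO_UPDATE_STATISTICS_ASYNC ON;
ALTER DATABASE [model] SET COMPATIBILITY_LEVEL = 140;
ALTER DATABASE [model] SET QUERY_STORE = ON;

-- Default growth settings (model's logical file names are modeldev/modellog)
ALTER DATABASE [model] MODIFY FILE (NAME = N'modeldev', FILEGROWTH = 256MB);
ALTER DATABASE [model] MODIFY FILE (NAME = N'modellog', FILEGROWTH = 128MB);
```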

What Other Things Can You Do?

Now, you can go above and beyond just the database properties. You can add tables, views, triggers, functions, etc. to your model database, and every time a new database is created those objects will exist in it too. Why is this useful? In the past, I’ve used this for tracking my DDL (data definition language) changes. I created a trigger that would insert into a table the user, object, date and time, and a text snippet of any ALTER/DROP/CREATE statement that was run on a database. For it to work, the trigger needed to exist on all databases.
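A hedged sketch of such a trigger, with hypothetical table and trigger names and a trimmed-down event list. Created in model, it will then exist in every database created afterward:

```sql
CREATE TABLE dbo.DDLAudit
(
    EventDate  DATETIME2     NOT NULL DEFAULT SYSDATETIME(),
    LoginName  SYSNAME       NOT NULL,
    ObjectName NVARCHAR(128) NULL,
    EventType  NVARCHAR(128) NULL,
    TSQLText   NVARCHAR(MAX) NULL
);
GO
CREATE TRIGGER trg_DDLAudit
ON DATABASE
FOR CREATE_TABLE, ALTER_TABLE, DROP_TABLE,
    CREATE_PROCEDURE, ALTER_PROCEDURE, DROP_PROCEDURE
AS
BEGIN
    DECLARE @e XML = EVENTDATA();  -- XML describing the DDL event
    INSERT INTO dbo.DDLAudit (LoginName, ObjectName, EventType, TSQLText)
    VALUES (ORIGINAL_LOGIN(),
            @e.value('(/EVENT_INSTANCE/ObjectName)[1]', 'NVARCHAR(128)'),
            @e.value('(/EVENT_INSTANCE/EventType)[1]', 'NVARCHAR(128)'),
            @e.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'NVARCHAR(MAX)'));
END;
```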

Final Words

We all know each environment is different, so don’t just go and implement everything; tailor it to your needs. I suggest you take a look at yours and see if there is anything you can adjust. You may be surprised at what you can tweak.

Note:

*In testing this, I found that if you create a new database using CREATE DATABASE in T-SQL, the auto-growth sizes do not get inherited by the new database, but everything else does. If I create a new database using the GUI, these settings do propagate. I am not sure if this is by design or a bug.

Synchronous VS Asynchronous Statistics Updates

One of the things I’ve been able to implement to help with performance is changing from synchronous statistics updates to Auto Update Statistics Asynchronously. It’s a simple change that can have a big impact when implemented in highly transactional OLTP environments. Notice I said OLTP, not OLAP: data in an OLAP environment tends not to be as dynamic, so it’s rare to enable this in a data warehouse.

So, what’s the difference between the two and why does it help?

Synchronous (the default, AUTO_UPDATE_STATISTICS = TRUE)

By default, when Auto Update Statistics is set to True, the SQL Server Query Optimizer will automatically update statistics when data has met a threshold of changes (insert, update, delete, or merge) and the estimated rows are now potentially stale. When statistics are stale, execution plans can become suboptimal, which can lead to degradation in performance.

This best-practice option ensures your statistics stay as up to date as possible. Each time a cached query plan is executed, the optimizer checks for data changes and potentially generates new statistics. This behavior is exactly what we want, but there is a catch: the query will be “held” while the statistics are updated, and the plan will recompile to use the new values before running. This caveat can slow down the execution process dramatically.

Auto Update Statistics Asynchronously (AUTO_UPDATE_STATISTICS_ASYNC =TRUE)

This option does the same thing as the above, but with one significant difference: it allows the optimizer to run a query first and apply the updated statistics afterward. Where this option differs from synchronous is that a query will NOT be “held” while the statistics are updated. Queries run “as is” until the query optimizer completes the statistics update, and then they recompile to use the new statistics the next time they run.

Confused yet? OK, now in English.

When the asynchronous setting is enabled, a query keeps running as it is even if the statistics it uses are not yet up to date; once they are, the next run uses the new numbers. It does not have to wait for the numbers to be updated before it can run. That’s where you get your performance boost: by not having to wait.

Check your settings using TSQL on ALL Databases
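You can check every database at once from sys.databases:

```sql
SELECT name,
       is_auto_update_stats_on,
       is_auto_update_stats_async_on
FROM sys.databases
ORDER BY name;
```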

How to Turn it on TSQL
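A one-liner per database (YourDatabase is a placeholder):

```sql
ALTER DATABASE [YourDatabase] SET AUTO_UPDATE_STATISTICS_ASYNC ON;
```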

GUI

Under Database Properties > Options

NOTE: To enable this option Auto Update Statistics must be left ON.

Last Words

Remember, every environment is different, so be sure to test this before implementing it in production. A simple change from synchronous to asynchronous can make a difference. It is definitely something to add to your performance tuning tool belt.

Hmmm… What’s This?

OK, so I was digging and poking around in SQL Server again and came across a database option called Date Correlation Optimization Enabled = False. Honestly, I had no clue what it did, so I took it as a learning opportunity to look into it and do a little research. Who knows, it may actually help me solve one of the many problems I run into day to day for clients.

Syntax
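The option is set per database (YourDatabase is a placeholder):

```sql
ALTER DATABASE [YourDatabase] SET DATE_CORRELATION_OPTIMIZATION ON;
```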

So, What Does It Do?

According to MSDN – The DATE_CORRELATION_OPTIMIZATION database SET option improves the performance of queries that perform an equi-join between two tables whose date or datetime columns are correlated, and which specify a date restriction in the query predicate.

How many of you read what MSDN says and think “wuuuuuttt, English please”? I do.

In English

Basically, it uses a foreign key relationship between tables in SQL Server to enhance the performance of date and datetime queries when the dates fall within a certain defined range of each other (correlate). OK, that’s cool, but what’s the big deal? The power really comes in for things like reporting, validation, and data warehouses. With this option turned on, SQL Server maintains statistics between the correlated columns and creates improved execution plans that read less data.

Let’s See It in Action

Consider this: all internet orders that are received have a must-send-out-by date (due date) of 10 days after the order is received. Therefore, the OrderDate and DueDate are correlated, i.e. related to each other.

Here is a query you would normally run.
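Something along these lines, where dbo.Orders and dbo.OrderLines are hypothetical stand-ins for the two correlated tables:

```sql
SELECT o.OrderID, o.OrderDate, l.DueDate, l.ProductID
FROM dbo.Orders AS o
JOIN dbo.OrderLines AS l
    ON o.OrderID = l.OrderID
WHERE o.OrderDate BETWEEN '20170101' AND '20170131';
```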

Without DATE_CORRELATION_OPTIMIZATION turned on, the optimizer would create a plan just like it would for anything else; however, with it set to ON, the optimizer can build more granular execution plans.

Here’s how

With each INSERT, UPDATE, and DELETE between these two tables, SQL Server gathers statistics that help the optimizer infer the query to be more like the one below. This is where the power comes in: the optimizer can better narrow down the records it needs to read, therefore returning results faster.

Here is the way SQL Server interprets the dates now that correlation is turned on and statistics are being gathered. Based on those statistics, it can now infer that each DueDate is exactly 10 days after its OrderDate.
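In effect, the optimizer treats the query as if it also carried the matching DueDate restriction (same hypothetical tables as above):

```sql
SELECT o.OrderID, o.OrderDate, l.DueDate, l.ProductID
FROM dbo.Orders AS o
JOIN dbo.OrderLines AS l
    ON o.OrderID = l.OrderID
WHERE o.OrderDate BETWEEN '20170101' AND '20170131'
  AND l.DueDate   BETWEEN '20170111' AND '20170210';  -- inferred: OrderDate + 10 days
```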

Depending on the number of records in the two tables this can be a VERY significant decrease to execution times.

The Caution

You should not enable DATE_CORRELATION_OPTIMIZATION in update-intensive database environments. SQL Server keeps all the correlation information in the form of statistics, which means every INSERT, UPDATE, and DELETE incurs additional overhead.

As always, be sure to test it before you use it in production.

What does this little check box do?

Ever wander around the SQL Server properties and wonder what those little check boxes turn on? I do, and I get very tempted to check them. Here is one of those tempting little boxes that seems pretty handy: Use query governor to prevent long-running queries.

Syntax
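The check box maps to the 'query governor cost limit' advanced option; 180 here is just an example value:

```sql
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'query governor cost limit', 180;
RECONFIGURE;
```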

How Does it Work?

It’s simple. This option, available in SQL Server 2008 Standard edition and forward, prevents long-running queries based on run time measured in seconds. If I specify a value of 180, the query governor will not allow execution of any query that it estimates will exceed that value. Notice it says ESTIMATES, which means the decision is based on optimizer estimates, not ACTUAL run times. It does NOT KILL an actively running query after the designated amount of time, so there are no worries about rollback scenarios or partial data.

CAUTION

This is an advanced option, and keep in mind it applies to the whole server instance. It will also affect your maintenance queries, so please use it with caution; this is not a “let me check this box for fun” option.

But Wait There’s More

There is also a session-level option available to us that will limit a specific query. It takes the optimizer’s estimate for the query and prevents it from running if it would go over the boundary we have set. Notice we set the limit before the query and then back to 0 after.
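A sketch, with dbo.BigTable as a hypothetical query target:

```sql
SET QUERY_GOVERNOR_COST_LIMIT 180;  -- applies to this session from here on

SELECT *
FROM dbo.BigTable;  -- refused if the optimizer's estimate exceeds the limit

SET QUERY_GOVERNOR_COST_LIMIT 0;    -- 0 removes the limit again
```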

Again, playing with any old check box is not a recommended practice. Make sure you research it first and understand the full impact before checking that tempting little box.

Initial SQL Server Configurations

Wonder if I Do Things Differently?

I am always wondering what other DBAs do and whether I am doing things differently. One such thing is my initial server setup: basically, what I configure for each of my new servers. So, why not blog about it and see what others chime in with after they read this? Keep in mind that everyone has different requirements and different ways they like to do the actual configurations.

For now, I am not going to go into what each of these configurations does and why I choose the values I do. That’s for another time. If you want that information, you can always go to Books Online or just Google it. In this case, I am just going to give you a running list, with scripts, that I’ve added to over the years based on best practices and experience.

How Does Yours Compare?

I’d really love to hear what others do and whether I have missed something that you like to implement that may benefit me as well. So leave a comment, tweet at me, or send me an email; let’s compare notes.

The List

So here are the basic setups I do on every server post-install, in no particular order.

* Value varies based on server configuration

  1. Min and Max Memory *
  2. Enable and Configure Database Mail (this is only to enable; the full script will be in a later post)
  3. Set Default Database Locations
  4. Set SQL Agent Job History Retention
  5. Set Cost Threshold for Parallelism *
  6. Set Max Degree of Parallelism *
  7. Set Optimize for Ad Hoc Workloads
  8. Change Number of Error Logs
  9. Create Cycle Error Log Job
  10. Add Additional TempDB Files, All With the Same Size and Growth Rates *
  11. Set Media Retention
  12. Set Backup Compression Default On
  13. Change to Audit Successful and Failed Logins
  14. Set Default Growths in Model to Not Be Percentages
  15. Set Ad Hoc Distributed Queries Off
  16. Set CLR Enabled Off
  17. Set Ole Automation Procedures Off
  18. Set Scan for Startup Procs Off
  19. Set xp_cmdshell Off
  20. Set Up Operators
  21. Set Up Alerts for Severities 17-25 and Error Codes 823, 824, 825 (remember to add the alerts to the operator)
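To give a flavor of the scripts, here is a minimal sketch of a few of the instance-level items via sp_configure. The starred values vary by server; the numbers below are examples, not recommendations:

```sql
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

EXEC sys.sp_configure N'min server memory (MB)', 4096;        -- * varies
EXEC sys.sp_configure N'max server memory (MB)', 28672;       -- * varies
EXEC sys.sp_configure N'cost threshold for parallelism', 50;  -- * varies
EXEC sys.sp_configure N'max degree of parallelism', 4;        -- * varies
EXEC sys.sp_configure N'optimize for ad hoc workloads', 1;
EXEC sys.sp_configure N'backup compression default', 1;
EXEC sys.sp_configure N'Database Mail XPs', 1;
RECONFIGURE;
```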

Note: Most of these can be set using the GUI as well as the scripts above. Also, in addition to these configurations, I make sure that the server is brought up to the most current stable CU or Service Pack. Everyone’s environment is different; my list may not be right for you.