Does Your Code Have a Preamble?

Okay, here is a pet peeve of mine: I think every stored procedure, function, view, etc. should contain a block of code I refer to as a preamble. If yours doesn't, I strongly recommend you start adding it. It drives me crazy when I see code with no documentation of any kind telling me what it is for and when it was written or changed.

Why? A preamble documents the use, need, and changes for the code. It also leaves bread crumbs as to how, why, and what you did. I don't know about you, but I may code something and not have to change it for two years. When I do, I think back and ask why I did that, or who changed this code last. Working as a lone DBA, leaving bread crumbs was critical as I constantly jumped from task to task.
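
Here is a minimal sketch of what such a preamble can look like; every name, date, and job reference in it is a hypothetical placeholder:

    /*****************************************************************
    Object:      dbo.usp_LoadDailySales        -- hypothetical name
    Author:      Your Name
    Created:     2017-06-01
    Description: Loads the daily sales staging data into dbo.Sales.
    Called by:   SQL Agent job "Nightly ETL - Step 2"
    Run:         EXEC dbo.usp_LoadDailySales @LoadDate = '2017-06-01';
    ------------------------------------------------------------------
    Change History:
    2017-08-15  Your Name   Added @LoadDate parameter for reruns.
    *****************************************************************/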

That is the basic shape of the preamble I use for all code I write. It tells who wrote it, what it is, what it is called by, how to run it, and lists any changes made to it. I find one of the most helpful items here is the Run documentation. There I place an exact run statement. It shows how the parameters should look and gives me a quick way to test it.

There are a million and one reasons why you should be doing this in your code. If you're not, take a second and start. You'll thank me for it later.

Why I Go to Summit Each Year

This year will be my 6th PASS Summit. Some people have asked me why I still go, and what I get out of Summit that I don't get from attending and speaking at SQL Saturdays. That's an easy one for me to answer, but a long answer at that.

Networking

First and foremost, it's for the networking. Getting to meet up with so many other like-minded people is gratifying. This networking allows you to exchange ideas, war stories, and downright geek out with others who know what you are talking about and don't have their eyes glaze over when you discuss optimizer internals.

On top of all that, it's career building. You never know when you will have that one conversation with someone that leads to your next career step. It's the meeting of the #SQLFamily face to face that can sometimes lead to your name being brought up for a new position that may be opening. Believe me, I know of many people who have landed their next opportunity just from the interactions they have had at Summit.

Sessions

Learning, learning, learning is the name of the game. The world of data changes so fast these days, and I want to keep up. No single environment can give you exposure to all the facets of SQL Server and other data-related topics. Summit gives me a wider view of what's out there and provides ideas for things I may be able to implement.

Vendors

I love gadgets and gizmos. Learning about what products are out there to make me more efficient at my job is always something I look forward to. Tools were essential for me when I was a lone DBA, so I always looked forward to visiting each booth and seeing what's new. Spending a lot of time on the show floor is a good investment: you get to meet the folks who work for the vendors that support the community and learn about the tools and offerings they have.

CAT Team

Did you know the Microsoft Customer Advisory Team (CAT) is onsite and willing to answer all your questions FOR FREE? Yep, no $300 support fee just to pick up the phone; they are there in person. I've asked them a question each year pertaining to an environment I was working in, and they were able to help me solve it. The CAT team isn't support; they work with the biggest workloads SQL Server supports and have seen almost everything.

Inspiration and Renewal

I always find a sense of renewal after attending Summit. I get a little kick start on new things to do at work that make my job even more exciting. Many people get into a rut or routine and their jobs can become mundane. Getting out of the office and seeing others who have the same passion for data as you can really help.

Lastly, Friendships

Summit is a #SQLFamily reunion of sorts. Many of us converse daily on Twitter and only get to see each other in person once a year. Summit brings us all together. Some of us are lucky and able to meet up with each other more frequently as we travel to SQL Saturdays and other events, but Summit is kinda like a home base for us. We all look forward to those #SQLHugs, catching up, and meeting new family members.

This year’s Summit will be a little different for me. When I am not getting my learn on, you will find me in the Denny Cherry & Associates (DCAC) Vendor Booth. Yep, this time I get to play booth babe. I can’t wait to be on the other side of the table telling other attendees what DCAC is all about and helping talk through some of their performance issues.

Anyhow, this may be my 6th PASS Summit, but it won't be my last. I encourage everyone, whether you have never been or have gone before, to go. It's not too late to register; if you use the discount code in the image, you can save $150. There is so much you can get out of it each and every time you attend.

See you there!

SQL Sequence vs Identity Column

Let's take a look at what a sequence is in relation to an identity column in SQL Server. Did you know sequences even existed? I didn't until I was asked about them. It's amazing how much you can skip over and never notice in SSMS. See this little folder? Ever notice it under Programmability in Management Studio? Yep, it's there: SQL Server has this very handy thing called sequences. Sequences are a relatively new feature that has only existed since SQL Server 2012, but they have long existed in Oracle (where there are no identity columns).

What is a Sequence?

Per MSDN, "A sequence is a user-defined schema-bound object that generates a sequence of numeric values according to the specification with which the sequence was created. The sequence of numeric values is generated in an ascending or descending order at a defined interval and can be configured to restart (cycle) when exhausted." It's important to note that sequences can be cached and are not guaranteed to be in sequential order.

The Code
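
A minimal sketch of creating one; the name and values are just for illustration:

    -- A simple ascending sequence starting at 1, incrementing by 1
    CREATE SEQUENCE dbo.SampleSequence
        AS INT
        START WITH 1
        INCREMENT BY 1
        MINVALUE 1
        MAXVALUE 1000
        CYCLE       -- restart at MINVALUE when MAXVALUE is exhausted
        CACHE 50;   -- preallocate 50 values in memory for performance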

After creation, you can look at the properties in the GUI. Note that you can set the increment, restart the sequence, and even set min and max values.

How to Query
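
You ask the sequence for its next number with NEXT VALUE FOR, and sys.sequences shows its current state. Continuing the illustrative sequence above:

    -- Get the next value from the sequence
    SELECT NEXT VALUE FOR dbo.SampleSequence AS NextID;

    -- Check the sequence's current value and settings
    SELECT name, current_value, increment
    FROM sys.sequences
    WHERE name = 'SampleSequence';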

What’s an Identity Column?

An identity column is a property of a table, set with an initial seed value (starting value). For each insert, it assigns a new incremental value that is added to the identity value of the previous row that was loaded.

The Code
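
A minimal sketch of a table with an identity column; the table and column names are illustrative:

    -- NameID is the identity column: IDENTITY(seed, increment)
    CREATE TABLE dbo.Names
    (
        NameID    INT IDENTITY(1,1) NOT NULL,
        FirstName VARCHAR(50) NOT NULL
    );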

Note: after the field's data type, IDENTITY declares the seed (1), then the increment value (1). You can see this in the GUI below under the column properties.

How to Query

Let’s insert two records and see the NameID Identity column increment.
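
Continuing the sketch above:

    INSERT INTO dbo.Names (FirstName) VALUES ('Monica');
    INSERT INTO dbo.Names (FirstName) VALUES ('Denny');

    -- NameID is assigned automatically: 1, then 2
    SELECT NameID, FirstName FROM dbo.Names;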

Comparing the two

| Attribute | Sequence | Identity |
| --- | --- | --- |
| Object level | Database | Table |
| Limit | Can set a limit | Limited by the data type (INT vs. BIGINT) |
| Values | Generated by an application call using NEXT VALUE FOR | Generated on INSERT into a table |
| Increments | Declared as INCREMENT at setup and can be anything; a negative number makes the sequence descend instead of ascend | Declared as the increment at setup and can be any positive number; numbers ascend |
| Scope | Generated outside the scope of a transaction | Generated within a transaction |
| Number assignment | Can be preallocated (e.g., "assign me numbers 1-25") | Cannot be preallocated; assigned in order by INSERT |
| Gaps | Can experience gaps | Can experience gaps |
| Uniqueness | Not guaranteed; the number can be reset and reused | Often used as a primary key (you must choose that property to ensure a unique value) |

Summary

So, this was just a quick look at what a sequence is compared to an identity column. Both can be very useful. If you're looking for a unique value, your best bet is to go with an identity column and the primary key option. If you just want an auto-generated value to use in an application outside of a table, a sequence is a sure bet. Play around with them; I am sure you can come up with a million and one uses for each.

*** UPDATE: New in SQL Server 2017 ***

Per MSDN, there is a new option:

IDENTITY_CACHE = { ON | OFF }

Applies to: SQL Server 2017 and Azure SQL Database (feature is in public preview)

Enables or disables identity cache at the database level. The default is ON. Identity caching is used to improve INSERT performance on tables with Identity columns. To avoid gaps in the values of the Identity column in cases where the server restarts unexpectedly or fails over to a secondary server, disable the IDENTITY_CACHE option. This option is similar to the existing SQL Server Trace Flag 272, except that it can be set at the database level rather than only at the server level.
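
If you want to try it, here is a sketch of turning the cache off, run in the context of the database you want to change:

    -- Disable identity caching for the current database (default is ON)
    ALTER DATABASE SCOPED CONFIGURATION SET IDENTITY_CACHE = OFF;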

Just Check ALL the Boxes

Today I ran into something on a client server that I unfortunately see too often. The DBA goes through the trouble of configuring and setting up alerts/operators but doesn't really understand what the options in the configuration mean. So, unfortunately, they take the CYA (cover your ass) approach and check all of them. I have seen this not only with alerts but also with things like security configurations. My advice is to always take a second and research what each option does before you check the little boxes, especially when it comes to security. Always follow the rule that less is more.

In the example below, the administrator enabled alerts for an operator using the CYA approach. They checked email, pager, and net send.

So, what's the big deal? This server experienced an insufficient resources (space) alert that fired every minute, and because pager notifications were enabled, it caused the error log to bloat, consumed unnecessary space, and created noise in the logs.

The administrator of this environment really only needed to configure the email notification, as the company did not use net send nor have pager duties configured. To be honest, I have yet to see an environment use more than that, and per Microsoft, both pager and net send will be removed in future versions of SQL Server.
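
For reference, here is a sketch of creating an operator with only an email notification; the operator name and address are placeholders:

    -- Create an operator notified by email only (no pager, no net send)
    EXEC msdb.dbo.sp_add_operator
        @name = N'DBA Team',
        @enabled = 1,
        @email_address = N'dba-team@yourcompany.com';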

So, the moral of the story is: please take the time to research what the little checkboxes do before you enable them. The example above is a pretty benign one, but you can imagine what kind of messes you can get yourself into with other, more critical things like security.

A Side Note:

If you want to learn how to set up your alerts and operators, I've already written a blog post on that, with scripts; you can find it here.

You can also visit github.com/dc-ac for a full install script that includes the alert and operator setups: https://github.com/DC-AC/SQL2016_Scripted_Install

Hmmm… What’s This?

OK, so I was doing some digging and poking around again in SQL Server and came across a database option called Date Correlation Optimization Enabled = False. Honestly, I had no clue what it did, so I took it as a learning opportunity to look into it and do a little research. Who knows, it may actually help me solve one of the many problems I run into day to day for clients.

Syntax
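
The option is set per database; a sketch, using a stand-in database name:

    -- Turn date correlation optimization on (the default is OFF)
    ALTER DATABASE AdventureWorks SET DATE_CORRELATION_OPTIMIZATION ON;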

So, What Does It Do?

According to MSDN: "The DATE_CORRELATION_OPTIMIZATION database SET option improves the performance of queries that perform an equi-join between two tables whose date or datetime columns are correlated, and which specify a date restriction in the query predicate."

How many of you read what MSDN says and think "wuuuuuttt, English please"? I do.

In English

Basically, it uses a foreign key relationship between tables in SQL Server to enhance the performance of date and datetime queries when the dates fall within a certain defined range of each other (correlate). OK, that's cool, but what's the big deal? The power really comes in for things like reporting, validation, and data warehouses. With this option turned on, SQL Server maintains statistics between correlated columns and creates improved execution plans that read less data.

Let’s See It in Action

Consider this: all internet orders that are received must be sent out by a due date 10 days after the order is received. Therefore, the OrderDate and DueDate are correlated, i.e. related to each other.

Here is a query you would normally run.
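
Something along these lines, using hypothetical order tables that carry the OrderDate and DueDate columns from the example:

    -- Orders received during the first week of January
    SELECT h.OrderID, h.OrderDate, d.DueDate, d.ProductID
    FROM dbo.OrderHeader AS h
    JOIN dbo.OrderDetail AS d
        ON h.OrderID = d.OrderID
    WHERE h.OrderDate >= '20170101'
      AND h.OrderDate <  '20170108';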

Without DATE_CORRELATION_OPTIMIZATION turned on, the optimizer would create a plan just like for anything else; however, with it set to ON, the optimizer can make more granular execution plans.

Here’s how

With each INSERT, UPDATE, and DELETE between these two tables, SQL Server gathers statistics that help the optimizer internally rewrite the query to be more like the one below. This is where the power comes in. The optimizer can better narrow down the records it needs to read, therefore returning results faster.

Here is the way SQL Server interprets the dates now that correlation is turned on and statistics are being gathered. Based on those statistics, it can now infer that each DueDate is exactly 10 days after the OrderDate.
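
Continuing the hypothetical tables above, the optimizer can behave as if the query had been written like this:

    -- The correlation statistics let the optimizer add an implied
    -- range on DueDate (OrderDate + 10 days in this example)
    SELECT h.OrderID, h.OrderDate, d.DueDate, d.ProductID
    FROM dbo.OrderHeader AS h
    JOIN dbo.OrderDetail AS d
        ON h.OrderID = d.OrderID
    WHERE h.OrderDate >= '20170101'
      AND h.OrderDate <  '20170108'
      AND d.DueDate   >= '20170111'
      AND d.DueDate   <  '20170118';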

Depending on the number of records in the two tables this can be a VERY significant decrease to execution times.

The Caution

You should not enable DATE_CORRELATION_OPTIMIZATION in update-intensive database environments. SQL Server keeps all the correlation information in the form of statistics, which means every INSERT, UPDATE, and DELETE incurs additional overhead.

As always, be sure to test it before you use it in production.

What does this little check box do?

Ever wander around the SQL Server properties and wonder what those little check boxes turn on? I do, and I get very tempted to check them. Here is one of those tempting little boxes that seems pretty handy: "Use query governor to prevent long running queries".

Syntax
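
The server-wide setting is exposed through sp_configure; it is an advanced option, so those have to be shown first. A sketch:

    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    GO
    -- Refuse any query the optimizer estimates will exceed 180 seconds
    EXEC sp_configure 'query governor cost limit', 180;
    RECONFIGURE;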

How Does it Work?

It's simple. This option, available from SQL Server 2008 Standard onward, will prevent long-running queries based on run time measured in seconds. If I specify a value of 180, the query governor will not allow execution of any query that it estimates will exceed that value. Notice it says ESTIMATES, which means it is based on optimizer estimates and not ACTUAL run times. It does NOT KILL an actively running query after the designated amount of time, so there are no worries about rollback scenarios or partial data.

CAUTION

This is an advanced option, and keep in mind it is a server-wide (instance-level) option. It will also affect your maintenance queries, so please use it with caution; this is not a "let me check this box for fun" option.

But Wait There’s More

There is also a per-session option available to us that will limit a specific query. This option estimates the cost of a query and prevents it from running if it would go over the boundary we have set. Notice in the sketch below that we set the limit before the query and then back to 0 after.
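
A sketch of the session-level version; the query in the middle is a hypothetical stand-in:

    -- Limit only this session: refuse anything estimated over 180 seconds
    SET QUERY_GOVERNOR_COST_LIMIT 180;

    SELECT COUNT(*) FROM dbo.BigTable;  -- hypothetical expensive query

    -- Turn the limit back off for the rest of the session
    SET QUERY_GOVERNOR_COST_LIMIT 0;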

Again, playing with any old check box is not a recommended practice. Make sure you research it first and understand the full impact before checking that tempting little box.

Please Don’t Do This!

Please, please, please, admins: do not leave your default index fill factor at 0. This means you are telling SQL Server to fill the page 100% full when creating indexes. It also means you are forcing it to allocate a new page when additional inserts come in. These allocations are called PAGE SPLITS, which take time to perform and are resource-intensive operations. Leaving the fill factor this high will cause more index fragmentation, decrease performance, and increase IO.

If you find that this is how your system is configured, all is not lost. You can correct it by changing the default value, so that new indexes are created with the proper fill factor, and by rebuilding your existing indexes with a new fill factor value. I like to use 80 across the board for most; of course, there is always the "it depends" scenario that arises, but 80 is a pretty safe bet. One of those "it depends" cases would be a logging table that has the correct clustering key and never gets updates in between values; there I don't want a fill factor of 80. I'd want 0/100 to maximize page density, since page splits won't occur if the clustered key is monotonically increasing.

Note: having the additional 20% free on each page will increase your storage requirements, but the benefit outweighs the cost.

Example syntax for changing the default
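
A sketch; note that this instance-wide setting requires a restart of SQL Server before it takes effect:

    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    GO
    -- New indexes will now default to 80% full pages (20% free)
    EXEC sp_configure 'fill factor (%)', 80;
    RECONFIGURE;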

Example script for rebuilding an index with a new fill factor
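
Something like this, with hypothetical table and index names:

    -- Rebuild an existing index, leaving 20% free space on each page
    ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
        REBUILD WITH (FILLFACTOR = 80);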


System-Versioned Temporal Tables

Every once in a while, I like to take a moment and learn something new about the latest SQL Server gizmos and gadgets. Today I came across system-versioned temporal tables, and it piqued my interest, so I figured I'd investigate and share my findings with you.

How many of you have needed to track data changes over time? I've needed this many times, for things like auditing, investigating data changes, data fixes, and trend analysis of values over time. Doing this in the past has been a very daunting task at times and sometimes nearly impossible. This is where system-versioned temporal tables really help out. They give us a new way to do just that, with a new user table type. It keeps a full history of data changes and gives us a way to query it in order to do point-in-time analysis. What I really like is that you can't INSERT or UPDATE data into the period datetime columns, as they are automatically generated with the insert, which is great for auditing.

Syntax for Temporal Table Creation
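
A minimal sketch; the table and column names are illustrative, but note the BeginDate/EndDate pair referenced below:

    CREATE TABLE dbo.Employee
    (
        EmployeeID INT NOT NULL PRIMARY KEY CLUSTERED,
        Name       VARCHAR(100) NOT NULL,
        Position   VARCHAR(100) NOT NULL,
        -- the two required, system-generated datetime2 columns
        BeginDate  DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
        EndDate    DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
        PERIOD FOR SYSTEM_TIME (BeginDate, EndDate)
    )
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));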

Note that we now have two required datetime2 columns that will be populated with our temporal history data for each row.

Let's insert some records and see how the data looks.
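
Continuing the sketch above:

    INSERT INTO dbo.Employee (EmployeeID, Name, Position)
    VALUES (1, 'Jane Doe', 'DBA');

    -- The update automatically copies the old row to dbo.EmployeeHistory
    UPDATE dbo.Employee
    SET Position = 'Senior DBA'
    WHERE EmployeeID = 1;

    SELECT * FROM dbo.Employee;         -- current state
    SELECT * FROM dbo.EmployeeHistory;  -- prior versions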

This great illustration from Microsoft shows just how the history is tracked. For each inserted record, the SysStartTime populates our BeginDate field. With each additional UPDATE/DELETE/MERGE, the current record is copied to a history table and its EndDate is updated with the SysEndTime.

How do you query it?

Querying uses a new clause, FOR SYSTEM_TIME, which you combine with AS OF, FROM ... TO, BETWEEN ... AND, CONTAINED IN, or ALL.
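
For example, continuing the sketch above (the timestamp is a placeholder):

    -- Point-in-time: the row as it looked at a given moment
    SELECT EmployeeID, Name, Position, BeginDate, EndDate
    FROM dbo.Employee
        FOR SYSTEM_TIME AS OF '2017-09-01 12:00:00'
    WHERE EmployeeID = 1;

    -- Full history: current and past versions of the row
    SELECT EmployeeID, Position, BeginDate, EndDate
    FROM dbo.Employee
        FOR SYSTEM_TIME ALL
    WHERE EmployeeID = 1;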

Results

Note the current record has an end date of 9999-12-31 23:59:59 because it's the current state of the record. It hasn't been modified yet, so it gets the default future datetime.

If you are lucky enough to be using SQL Server 2016 or later, I highly recommend playing around with this new gadget; it may be of significant use to you.

VLFs the Forgotten Foe

How many of you check the amount of Virtual Log Files (VLFs) your transaction logs have?

Working as a consultant now, I see this as something that is often ignored by DBAs. It is an easy thing to maintain, and yet so many don't know how. Keeping these in check can give you a performance boost not only on startup but also with your insert/update/delete as well as backup/restore operations. SQL Server performs better with a smaller number of right-sized virtual log files. I highly recommend you add this to your server reviews.

What is a VLF?

Every transaction log is composed of smaller segments called virtual log files. Every time a growth event occurs, new segments (virtual log files) are created at the end of the transaction log file. A large number of VLFs can slow things down.

What causes High VLFs?

As transactions force growth of the log file, inappropriate log file sizing or auto-growth settings can cause a high number of VLFs to occur.  Each growth event adds VLFs to the log file.  The more often you grow in conjunction with smaller growth segments, the more VLFs your transaction log will have.

Example

If you grow your log by the default 1MB, you may end up with thousands of VLFs, as opposed to growing in 1GB increments. MSDN does a great job of explaining how transaction logs work; for a deeper dive, I recommend reading it.

How do I know how many VLFs my log files have?

It’s very easy to figure out how many VLFs you have in your log file.

Make sure you are in the context of the database you want to run it against, in this case tempdb, and run the DBCC LOGINFO command.
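
For example:

    USE tempdb;
    GO
    -- Returns one row per virtual log file in the transaction log
    DBCC LOGINFO;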

The command returns a result set with one row per VLF created for that database; the count of those rows is the number of VLFs you have.

Now, there are many ways you can get fancy with it using T-SQL, so have fun with it. Write something that rolls through all your databases and gives you the record counts for each. There are plenty of useful examples on the internet.

Ideally, the VLF count should be under 100; anything above that should be addressed.

*New for SQL Server 2017: the DMF sys.dm_db_log_stats(database_id) gives you an even easier way to get VLF counts.
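
A quick sketch that pulls the count for every database at once:

    -- SQL Server 2017+: VLF count per database
    SELECT d.name, ls.total_vlf_count
    FROM sys.databases AS d
    CROSS APPLY sys.dm_db_log_stats(d.database_id) AS ls;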

How do you fix it?

These transaction log files should be shrunk until there are only two VLFs, then grown in chunks back to the current size.

  • Perform Shrink using DBCC SHRINKFILE

  • Regrow your log in an increment that makes sense for your environment. However, if your target file size is in excess of 8GB, it is recommended to grow in 8000MB chunks while manually regrowing the file. Your autogrowth should be set to a lower value. There is no set rule for what those values should be; it may take trial and error to figure out what is best for your environment (see the sketch after this list).
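
A sketch of that shrink-and-regrow pattern; the database name, logical log file name, and sizes are placeholders for your environment:

    USE MyDatabase;
    GO
    -- Shrink the log as far as possible (target size 1MB)
    DBCC SHRINKFILE (MyDatabase_log, 1);
    GO
    -- Regrow in 8000MB chunks up to the working size
    ALTER DATABASE MyDatabase
        MODIFY FILE (NAME = MyDatabase_log, SIZE = 8000MB);
    GO
    ALTER DATABASE MyDatabase
        MODIFY FILE (NAME = MyDatabase_log, SIZE = 16000MB);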

Note: Growing out your log can cause a performance hit and block ongoing transactions; be sure to perform this during a maintenance window.

It's that simple. Now go take a look at your files; you may be surprised at what you find.

TIL: Microsoft Azure Part 2

Last week I started a multi-part series on Today I Learned (TIL) about Microsoft Azure.  This is part two of what I am learning in Azure.

Today’s topic is simply about Tenants, Subscriptions, Subscription Roles, Resource Groups, and Tags.

It’s Always Good to Start with Pictures

Here is a glimpse of how these topics relate. I will define and explain each below.

What is a Tenant?

In simplest terms, a Tenant is a container for multiple subscriptions. An example of two subscriptions would be Azure and Office 365. They would be owned by one account, either an individual or a company. A very large enterprise may use multiple subscriptions to better manage billing between divisions.

What Are Azure Subscriptions?

Basically, it's just an ownership account. Think of it as a billing and usage management account, whether it is a personal subscription or an enterprise-level one. The account allows you to group and manage multiple subscriptions for billing and reporting.

A subscription can encompass a mix of IaaS, PaaS, and SaaS services. All subscription management, billing report review, and creation of new subscriptions can be done through the http://account.windowsazure.com site, but you need to be an account administrator.

How Do I Get Subscriptions?

You can get them through a Trial, MSDN, Pay as you go using a credit card, Azure Resellers (called Cloud Solution Providers or CSPs) or Enterprise Agreements.

What are the Subscription Roles?

Microsoft offers roles based on "least privilege" within Azure at the subscription level. There are several roles that secure access to your cloud environment. The three main accounts below are all very powerful and should be limited to only a few people.

The top role is the Account Administrator. Think of this account in terms of what Enterprise Administrator is in your on-premises Active Directory. The Account Administrator has full rights. They have access to the account’s full financials and billing information for all subscriptions within the account, they can also create, delete and modify subscriptions.

The next role is the Service Administrator. This role is like the Domain Admin. It’s one level down from the account administrator and has full rights to the services in the subscription. They can do everything an account administrator can do with few exceptions, such as viewing the billing details of the subscription.

There is also the role of a Subscription or Co-administrator. This role is like a System Admin (SA) in SQL Server. It can create and delete resources within the subscription but has no control over billing or the ability to change the authentication source, such as AD.

The three accounts above control Role-Based Access Control (RBAC) for the rest of the user accounts at a resource level. They can assign users, or groups of users, the rights to manage only the resources they need for their particular roles. These are roles such as Owner, Contributor, and Reader of a resource group.

What’s a Resource Group?

A resource group is a container that separates resources into groups. Things that can exist in this container include VMs, NICs, storage, web apps, SQL, and virtual networks (VNETs). The "objects" within a resource group can be created, updated, and deleted as a group. One easy example of a resource group is a development environment: all parts associated with that environment are contained in that resource group.

What is a Tag?

The next, more granular level of organizing is Tags. These allow you to add your own metadata to objects in Azure. Think of them as labels or categories for reporting and organizing things like billing. For instance, if the resource groups within an ERP environment are tagged as "ERP", then those resource groups get categorized together for management purposes. If you've ever used extended properties in SQL Server, this is the same basic concept. There are, however, limits to the number of tags an individual resource can have, which is currently 15. Your Azure billing statement is grouped by tags, which makes this almost a mandatory feature.

Summary

In this part we covered Tenants, Subscriptions, Subscription Roles, Resource Groups, and Tags. Hopefully you got a basic understanding of each and how they relate to each other. Next, I will dive a little into the differences between Azure SQL Database and SQL Server on IaaS.