What does this little check box do?

Ever wander around SQL Server properties and wonder what these little check boxes turn on? I do, and I get very tempted to check them. Here is one of those tempting little boxes that seems pretty handy: “Use query governor to prevent long running queries.”

Syntax
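A minimal sketch of setting this option with sp_configure instead of the check box. The value 180 is only the illustrative number used in this post, not a recommendation.

```sql
-- 'query governor cost limit' is an advanced option, so advanced options
-- have to be visible before it can be changed.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- 180 is an illustrative value; 0 (the default) turns the governor off.
EXEC sp_configure 'query governor cost limit', 180;
RECONFIGURE;
```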

How Does it Work?

It’s simple. This option, available in SQL Server 2008 Standard and forward, prevents long running queries based on run time measured in seconds. If I specify a value of 180, the query governor will not allow execution of any query that it estimates will exceed that value. Notice it says ESTIMATES, which means it is based on optimizer estimates and not ACTUAL run times (the estimate is the optimizer’s cost, which lines up with seconds only on a reference hardware configuration). It does NOT KILL an actively running query after a designated amount of time, so there are no worries about rollback scenarios or partial data.

CAUTION

This is an advanced option, and keep in mind it is a server instance-wide option. It will also affect your maintenance queries, so please use it with caution. This is not a “let me check this box for fun” option.

But Wait There’s More

Now there is also a query “transaction” based option available to us that will limit a specific query. It is set for the current connection, estimates each query that follows, and prevents it from running if it would go over the boundary we have set. Notice we set the limit before the query and then back to 0 after.
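A sketch of that pattern using SET QUERY_GOVERNOR_COST_LIMIT. The limit of 180 and the deliberately expensive cross join are only illustrations; whether the query is actually blocked depends on the cost the optimizer estimates on your server.

```sql
-- Apply the limit to this connection only.
SET QUERY_GOVERNOR_COST_LIMIT 180;

-- An intentionally heavy query for illustration. If its estimated cost
-- exceeds the limit, SQL Server refuses to start it and raises an error.
SELECT COUNT(*)
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- Set the limit back to 0 (no limit) for the rest of the connection.
SET QUERY_GOVERNOR_COST_LIMIT 0;
```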

Again, playing with any old check box is not a recommended practice. Make sure you research it first and understand the full impact before checking that tempting little box.

Please Don’t Do This!

Please, please, please, Admins, do not leave your default index fill factor at 0. This means you are telling SQL Server to fill the page 100% full when creating indexes. This also means you are forcing it to a new page when additional inserts are done. These are called PAGE SPLITS, which take time to perform and are resource intensive. Having a high fill factor will cause more index fragmentation, decrease performance and increase IO.

If you find that this is how your system is configured, all is not lost. You can correct this by changing the default value so that new indexes will be created with a proper fill factor, and by rebuilding your existing indexes with another fill factor value. I like to use 80 across the board for most; of course there is always the “it depends” scenario that arises, but 80 is a pretty safe bet. One of those “it depends” cases would be a logging table that has the correct clustering key and never gets updates in between values (make sense?). There I don’t want a fill factor of 80. I’d want 0/100 to maximize page density, as page splits wouldn’t occur if the clustered key is monotonically increasing.

Note, having the additional 20% free on a page will increase your storage requirements but the benefit outweighs the cost.

Example syntax for changing the default
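A minimal sketch of changing the instance default with sp_configure, assuming the 80 suggested above. Note that this setting only takes effect after the SQL Server service is restarted, and it only applies to indexes created or rebuilt afterwards without an explicit fill factor.

```sql
-- 'fill factor (%)' is an advanced option.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- 80 follows the suggestion in this post; adjust for your workload.
EXEC sp_configure 'fill factor (%)', 80;
RECONFIGURE;
-- A restart of the instance is required before the new default is used.
```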

Example script for rebuilding an index with a new fill factor
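A sketch of a rebuild with an explicit fill factor. The table dbo.Orders and the index IX_Orders_CustomerID are hypothetical names used only for illustration.

```sql
-- Rebuild one index with 20% free space left on each leaf page.
-- dbo.Orders and IX_Orders_CustomerID are hypothetical names.
ALTER INDEX IX_Orders_CustomerID
    ON dbo.Orders
    REBUILD WITH (FILLFACTOR = 80);

-- Or rebuild every index on the table with the same fill factor.
ALTER INDEX ALL
    ON dbo.Orders
    REBUILD WITH (FILLFACTOR = 80);
```

Rebuilds are resource intensive, so it is worth scheduling them in a maintenance window.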

 

System-Versioned Temporal Tables

Every once in a while, I like to take a moment and learn something new about the latest SQL Server gizmos and gadgets. Today I came across system-versioned temporal tables and it piqued my interest, so I figured I’d investigate and share my findings with you.

How many of you have needed to track data changes over time? I’ve needed this many times for things like auditing, investigating data changes, data fixes, and trend analysis of values over time. Having to do this in the past has been a very daunting task at times and sometimes nearly impossible. This is where system-versioned temporal tables will really help out. They have given us a new way to do just that with a new user table type. It keeps a full history of those data changes and gives us a way to query it in order to do point in time analysis. What I really like about this is that you can’t INSERT or UPDATE data into the datetime columns as they are automatically generated with the insert, which is great for auditing.

Syntax for Temporal Table Creation
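A minimal sketch of the creation syntax. The table dbo.Employee, its history table dbo.EmployeeHistory and the business columns are hypothetical; the BeginDate and EndDate period columns follow the names used in this post.

```sql
-- dbo.Employee and dbo.EmployeeHistory are hypothetical names.
CREATE TABLE dbo.Employee
(
    EmployeeID   INT            NOT NULL PRIMARY KEY CLUSTERED,
    EmployeeName NVARCHAR(100)  NOT NULL,
    Salary       DECIMAL(10, 2) NOT NULL,
    -- The two period columns are maintained by SQL Server, not by us.
    BeginDate    DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    EndDate      DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (BeginDate, EndDate)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));
```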

Note we now have 2 required datetime2 fields that will be populated with our temporal history data for each row.

Let’s insert some records using INSERTS and see how the data looks.
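A sketch of what that could look like, assuming a system-versioned table such as a hypothetical dbo.Employee (EmployeeID, EmployeeName, Salary, plus the BeginDate and EndDate period columns) with a history table named dbo.EmployeeHistory. The rows are made-up sample data.

```sql
-- The period columns are left out of the column list on purpose:
-- SQL Server fills them in automatically.
INSERT INTO dbo.Employee (EmployeeID, EmployeeName, Salary)
VALUES (1, N'Ann', 50000.00),
       (2, N'Bob', 60000.00);

-- An update moves the previous version of the row to the history table.
UPDATE dbo.Employee
SET Salary = 55000.00
WHERE EmployeeID = 1;

-- Current rows.
SELECT EmployeeID, EmployeeName, Salary, BeginDate, EndDate
FROM dbo.Employee;

-- Previous versions of rows.
SELECT EmployeeID, EmployeeName, Salary, BeginDate, EndDate
FROM dbo.EmployeeHistory;
```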

This great illustration from Microsoft shows just how the history is tracked. For each INSERTED record, the SysStartTime will be populated in our BeginDate field. On each subsequent UPDATE/DELETE/MERGE, the current record is copied to a history table and its EndDate is set to SysEndTime.

How do you query it?

It uses a new clause, FOR SYSTEM_TIME, which you can combine with AS OF, FROM TO, BETWEEN AND, CONTAINED IN, or ALL.
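A few sketches of the clause, again assuming a hypothetical system-versioned table dbo.Employee with BeginDate and EndDate period columns. The date literals are arbitrary examples; the period columns are recorded in UTC, so the values you pass should be UTC as well.

```sql
-- The row as it looked at a single point in time.
SELECT EmployeeID, EmployeeName, Salary, BeginDate, EndDate
FROM dbo.Employee
    FOR SYSTEM_TIME AS OF '2017-06-01T00:00:00'
WHERE EmployeeID = 1;

-- Every version that was active at some point in a date range.
SELECT EmployeeID, EmployeeName, Salary, BeginDate, EndDate
FROM dbo.Employee
    FOR SYSTEM_TIME BETWEEN '2017-01-01T00:00:00' AND '2017-12-31T23:59:59'
WHERE EmployeeID = 1;

-- The current row plus its full history.
SELECT EmployeeID, EmployeeName, Salary, BeginDate, EndDate
FROM dbo.Employee
    FOR SYSTEM_TIME ALL
WHERE EmployeeID = 1
ORDER BY BeginDate;
```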

Results

Note the current record has an end date of 9999-12-31 23:59:59 because it’s the current state of the record. It hasn’t been modified yet, so it gets the default future datetime.

If you are lucky enough to be using SQL Server 2016 or later, I highly recommend playing around with this new gadget. It may be of significant use to you.

VLFs the Forgotten Foe

How many of you check the number of Virtual Log Files (VLFs) your transaction logs have?

Working as a consultant now, I see this as something that is often ignored by DBAs.  This is an easy thing to maintain and yet so many don’t know how to. Keeping these in check can give you a performance boost not only on startup but with your insert/update/delete as well as backup/restore operations. SQL Server performs better with a smaller number of right sized virtual log files.  I highly recommend you add this to your server reviews.

What is a VLF?

Every transaction log is composed of smaller segments called virtual log files. Every time a growth event occurs, new segments (virtual log files) are created at the end of your transaction log file. A large number of VLFs can slow things down.

What causes High VLFs?

As transactions force growth of the log file, inappropriate log file sizing or auto-growth settings can cause a high number of VLFs to occur.  Each growth event adds VLFs to the log file.  The more often you grow in conjunction with smaller growth segments, the more VLFs your transaction log will have.

Example

If you grow your log by the default 1 MB, you may end up with thousands of VLFs, as opposed to growing by 1 GB increments. MSDN does a great job of explaining how transaction logs work; for a deeper dive, I recommend reading it.

How do I know how many VLFs my log files have?

It’s very easy to figure out how many VLFs you have in your log file.

Make sure you are in the context of the database you want to run it against, in this case TEMPDB, and run the DBCC LOGINFO command.

The query will return one row for each VLF in that database’s log, so the COUNT of those rows is the number of VLFs you have.
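As a minimal sketch, the check against tempdb looks like this:

```sql
-- Switch to the database whose log you want to inspect.
USE tempdb;

-- Each row in the result set describes one virtual log file.
DBCC LOGINFO;
```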

Now there are many ways you can get fancy with it using TSQL, so have fun with it. Write something that rolls through all your databases and gives you the record counts for each. There are plenty of useful examples on the internet.

The VLF counts should be under 100 ideally, anything above should be addressed.

*New for 2017 is a DMV that gives you an even easier way to get the VLF counts: sys.dm_db_log_stats (database_id).
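A sketch of using that DMV to roll through every database at once. It relies on the total_vlf_count column that sys.dm_db_log_stats returns, so it only works on versions where the DMV exists.

```sql
-- One row per database with its VLF count, worst offenders first.
SELECT d.name AS database_name,
       ls.total_vlf_count
FROM sys.databases AS d
CROSS APPLY sys.dm_db_log_stats(d.database_id) AS ls
ORDER BY ls.total_vlf_count DESC;
```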

How do you Fix?

These transaction log files should be shrunk until there are only two VLFs, then grown in chunks back to the current size.

  • Perform Shrink using DBCC SHRINKFILE

  • Regrow your log in an increment that makes sense for your environment. However, if your file growth is in excess of 8 GB, it is recommended to grow in 8000 MB chunks while manually regrowing the file. Your autogrowth should be set to a lower value. There is no set rule for what those values should be; it may take trial and error to figure out what is best for your environment.

Note: Growing out your log can cause a performance hit and block ongoing transactions, so be sure to perform this during a maintenance window.

It’s that simple, now go take a look at your files. You may be surprised by what you find.
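The two steps above can be sketched as follows. The database SalesDB, the logical file name SalesDB_log, the 16000 MB final size and the 512 MB autogrowth value are all hypothetical; substitute your own. The shrink can only release VLFs that are inactive, so a log backup (or checkpoint in SIMPLE recovery) may be needed first.

```sql
USE SalesDB;  -- hypothetical database name

-- Find the logical name and current size (in MB) of the log file.
SELECT name, size / 128 AS size_mb
FROM sys.database_files
WHERE type_desc = 'LOG';

-- Shrink the log as small as it will go (target size is in MB).
DBCC SHRINKFILE (N'SalesDB_log', 1);

-- Regrow manually in 8000 MB chunks up to the size you need.
ALTER DATABASE SalesDB MODIFY FILE (NAME = N'SalesDB_log', SIZE = 8000MB);
ALTER DATABASE SalesDB MODIFY FILE (NAME = N'SalesDB_log', SIZE = 16000MB);

-- Set a smaller, fixed autogrowth increment for future growth.
ALTER DATABASE SalesDB MODIFY FILE (NAME = N'SalesDB_log', FILEGROWTH = 512MB);
```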

TIL: Microsoft Azure Part 2

Last week I started a multi-part series on Today I Learned (TIL) about Microsoft Azure.  This is part two of what I am learning in Azure.

Today’s topic is simply about Tenants, Subscriptions, Subscription Roles, Resource Groups, and Tags.

It’s Always Good to Start with Pictures

Here is a glimpse of how these topics relate. I will define and explain each below.

What is a Tenant?

In simplest terms, a Tenant is a container for multiple subscriptions. An example of two subscriptions would be Azure and Office 365. They would be owned by one account, an individual or a company. A very large enterprise may use multiple subscriptions to better manage billing between divisions.

What Are Azure Subscriptions?

Basically, it’s just an ownership account. Think of it as just creating a billing and usage management account, whether it is a personal subscription or an enterprise level. The account allows you to group and manage multiple subscriptions for billing and reporting.

A subscription can encompass a mix of IaaS, PaaS and SaaS services.  All subscription management, reviewing billing reports, and creating new subscriptions can be done through the http://account.windowsazure.com site, but you need to be an account administrator.

How Do I Get Subscriptions?

You can get them through a Trial, MSDN, Pay as you go using a credit card, Azure Resellers (called Cloud Solution Providers or CSPs) or Enterprise Agreements.

What are the Subscription Server Roles?

Microsoft offers roles based on “Least Privilege” within Azure at the subscription level. There are several roles that secure the access to your cloud environment. These three main accounts below are all very powerful accounts and should be limited to only a few.

The top role is the Account Administrator. Think of this account in terms of what Enterprise Administrator is in your on-premises Active Directory. The Account Administrator has full rights. They have access to the account’s full financials and billing information for all subscriptions within the account, they can also create, delete and modify subscriptions.

The next role is the Service Administrator. This role is like the Domain Admin. It’s one level down from the account administrator and has full rights to the services in the subscription. They can do everything an account administrator can do with few exceptions, such as viewing the billing details of the subscription.

There is also the role of a Subscription or Co-administrator. This role is like System Admin (SA) in SQL Server.  This role can create and delete resources within the subscription but has no control over billing or the ability to change the authentication source such as AD.

The three accounts above control the Role Based Access (RBAC) for the rest of the user accounts on a resource level. They can assign users, or groups of users, the rights to manage only the resources they need for their particular roles. These are roles such as Owner, Contributor and Reader of a resource group.

What’s a Resource Group?

A resource group is a container that separates resources into groups. Things that can exist in this container are things like VMs, NICs, Storage, Web Apps, SQL and Virtual Networks (VNETs). The “objects” within a resource group can be created, updated, and deleted as a group. One easy example of a resource group is a development environment, where all parts associated with that environment are contained in that resource group.

What is a Tag?

The next granular level of organizing is Tags. These allow for adding your own metadata to objects in Azure. Think of these as labels or categories for reporting and organizing things like billing. For instance, if the resource groups within an ERP environment are tagged as “ERP”, then those resource groups would get categorized together for management purposes. If you’ve ever used extended properties in SQL Server, this is the same basic concept. There are, however, limits to the number of tags an individual resource can have, which is currently 15. Your Azure billing statement is grouped by tags, which makes this almost a mandatory feature.

Summary

In this part we covered Tenants, Subscriptions, Subscription Roles, Resource Groups, and Tags. Hopefully you got a basic understanding of each and how they relate to each other. Next, I will dive a little into the differences between Azure SQL Database and SQL Server on IaaS.

 

TIL: Microsoft Azure Part 1

I thought maybe it would be a good idea to start a multi-part series on Today I Learned (TIL) about Microsoft Azure. As part of my new job I am currently learning as much about Azure as possible. As I learn things, I will blog to share what I am learning. It will cover beginner level things initially and gradually progress to more advanced topics.

Today’s topic is simply…. What the heck is Azure, how do I get to it, and what is the difference between IaaS, PaaS, and SaaS?

What is Azure?

According to Microsoft, “Microsoft Azure is a growing collection of integrated cloud services that developers and IT professionals use to build, deploy, and manage applications through our global network of data centers. With Azure, you get the freedom to build and deploy wherever you want, using the tools, applications, and frameworks of your choice.”

How do I get started in Azure Portal?

MS has a great walk through you can do to get you started. There is a free 30-day trial you can utilize to play around with along with $200 in Azure credits. I highly recommend getting an account and clicking through everything just to get the feel of all the offerings it has.

http://account.windowsazure.com

What is the difference between IaaS, PaaS, and SaaS?

You may have heard or seen the acronyms IaaS, PaaS and SaaS. Well what are they? Let’s start with their definitions and then how it pertains to SQL Server.

What is IaaS? (HOSTING)

Infrastructure as a Service or IaaS – Microsoft provides infrastructure capabilities such as an operating system, storage and network connectivity in a cloud offering. Basically, it’s the same as what you would have on-premises: Virtual Machines and everything they require to run your applications. You are able to install software such as SQL Server (aka SQL Server in IaaS) and configure it as needed. They host your applications and workloads just as you normally would; the only difference is that it is in the cloud (their data centers). This is very similar to the concept of using a co-location facility (CoLo) data center to store your servers, only with a lot more automation and features. One of the biggest benefits is that you do not have to maintain the underlying hardware or data center.

It’s like asking a Network\Storage administrator to set up a virtual machine for you, where you decide on all the requirements you want, such as “I need 5 drives with X amount of storage on certain types of disks, and this many CPUs.”

What is PaaS? (BUILD)

Platform as a Service or PaaS – This is the next level they offer, in which you do not have control over the infrastructure and don’t install the software. That is all chosen (standardized) for you based on your “tier” requirements and the platform you need, such as SQL Server (aka Azure SQL Database) or MySQL/Postgres. I will cover more on these services in a follow up post.

I think of PaaS as when you ask a Network\Storage administrator to give you a box to install SQL on and they give you a Templated VM with all its parts configured, including SQL Server already installed. MS offers many different PaaS services – including Cloud Services, Websites, Storage and Azure SQL Database.

What is SaaS? (CONSUME)

Software as a Service or SaaS – Simply put, these are things like Office 365. They are applications that are consumed in the cloud; no hardware or software is maintained by the company. You essentially just pay for the service and log in to the software.

Summary

So, in Part 1, we’ve covered the basics of what IaaS, PaaS and SaaS mean and how they can be leveraged. Next I will cover subscriptions and roles.  As I learn things I will continue to drop little tidbits like this, so look for them over the next few weeks.

Among Giants

Since becoming a Database Administrator I’ve always looked at Microsoft MVPs as the giants in our field.  I never once thought I could be among them. I am very humbled to be recognized as a Microsoft Data Platform MVP for 2017. Thank you to those that deemed me worthy enough to nominate me.

What is an MVP?

According to Microsoft, the MVP Award is an annual award that recognizes exceptional technology community leaders worldwide who actively share their high quality, real world expertise with users and Microsoft. Microsoft MVPs represent a highly select group of experts. MVPs share a deep commitment to community and a willingness to help others.

How did I get here?

There is no magic formula to becoming an MVP.  I blog, I tweet, I did a couple podcasts, I speak at SQL Saturdays, I run my local chapter, and I am a Regional Mentor, but that doesn’t mean you have to do the same.  The point is I try to give back to that community what they have given to me. That’s all it takes. I share what I know and just do my thing, somehow that worked and you can do it too.

What can you do to help others achieve this?

NOMINATE, NOMINATE, NOMINATE!  There are so many valuable members in the community that have not become an MVP simply because they have never been nominated. I’ve had some tell me that they thought I was already an MVP so never thought to do so. I think for a lot of us waiting in the wings for our chance this is the case. So take the time and nominate someone you deem worthy, whether you think they are already one or not.

Here is the link to do so.

https://www.mvp.microsoft.com/en-us/Nomination/NominateAnMvp

Thank You

While I said it above, thank you again to all of those who nominated and believed in me. I could not have done it without the support of the #SQLFamily all these years. I’m honored to be a Microsoft MVP.

Lone DBA Podcast

I recently had the pleasure of being a guest on a Podcast episode with the SQL Data Partners Carlos Chacon (B|T) and Steve Stedman (B|T).  If you haven’t had a chance to attend one of my sessions on Survival Tips for the Lone DBA, this is great insight into it. I share via questions and answers how it is to be a Lone DBA.

http://sqldatapartners.com/2017/03/28/episode-89-lone-dba/

Hide and Group Columns in SSRS Using a Parameter

Ever had users come to you and request another version of a report just to add another field and group data differently? Today was such a day for me. I really don’t like having multiple versions of the same report out there. So, I got a little fancy with the current version of the report and added a parameter, then used expressions to group the data differently and hide columns. For those new to SSRS, I’ve embedded some links to MSDN to help you along the way.

Current Report

The report gives summarized counts by invoice date.  It currently has a ROW group using date_invoiced and the detail row is hidden from user.

[Screenshots: current report, row group, group expression]

New Version

To complete the user request to have Item Codes and Descriptions added to the report I need to find a way to group the data by Item and show Item columns without disturbing the current report that is currently used by many consumers.

To Do:

  • Add Parameter
  • Set Available Values
  • Set Default Values
  • Add New Columns
  • Change Visibility
  • Change Grouping to group data using parameter

Step 1: Add Parameter

[Screenshot: add parameter]

Step 2: Set Available Values

[Screenshot: available values]

Step 3: Set Default Values – I want to make sure my current users get their version of the report simply, so I set it to No (N).

[Screenshot: default values]

Step 4: Next Add Columns.  I was lucky that the fields (Item Code, Item Desc) the user requested to be added were already part of the dataset used, so no additional coding was needed on the stored procedure.

[Screenshot: added fields]

Step 5: Next change the Visibility attributes. You want to HIDE the column when the IncludeItemDetails parameter is NOT YES (Y). I did this for both item columns.

[Screenshots: column visibility settings]

Step 6: Next I needed to change the grouping. The report is currently grouped by date_invoiced only. To make the data now total by Item, I need to group it by Item only when the IncludeItemDetails parameter is Yes (Y). I did this using an IIF expression, setting it to IF IncludeItemDetails=Y then group using the field value, else don’t (0). Again, I did this for both fields.

[Screenshots: grouping and group expressions]

You will see it’s relatively simple to do, and prevents a whole new report version from being created. For you beginners out there, it’s a very easy way to start to minimize the number of reports you have to maintain. Try it.

 

 

Challenge Accepted

My life for the last 2 years has been a constant battle of putting out fires with system performance; finally, user complaints have moved getting this resolved to my top priority.

Let’s see how I tackled the problem…

Symptoms:

  • Very High Disk Latency as high as 300,000 milliseconds (ms) is not unusual
  • Average: 900 – 15,000ms
  • Memory Pressure
  • Slow User Experience

Problem:

  • Bad hardware
  • Over-provisioned VM Hosts (what happens on one VM affects the other)
  • Old NetApp SAN
  • No infrastructure budget for new hardware

Challenge: Make the system viable with no hardware changes or tweaks

Step 1: Brainstorming (in no particular order)

  • Reduce I/O
    • I can probably tune a ton of old stored procedures
    • I need to do a full review of all indexes
  • Reduce blocking
  • Investigate daily data loads
    • How is the data loaded?
    • Can it be improved?

Step 2: Reduce I/O & Investigate daily data loads

After doing some research, it was found that we were truncating 48 tables daily with over 120 million records as part of our morning load. The process was taking over 2 hours to complete each morning and would often cause blocking. During this time users would run reports and complain data would not return in a timely manner. So I thought maybe this would be a great place to start.

I also noticed we were loading 8 tables to keep them “real time for reports” once every hour.  This resulted in a total of 9.6 million records being truncated and subsequently reloaded, taking approximately 17 minutes of every hour.

Solution: Implement transactional replication instead of doing hourly and morning truncate and reloading of tables.

Outcome: Once implemented, the disk I/O dropped drastically and disk latency reduced to an average of 200ms. The morning load times dropped from 2 hours to 9 minutes and the hourly load went to 5 seconds, down from 17 minutes. Now, the disk latency is still not optimal, but it is better. Best practices say it should be below 20ms.

This solution was difficult to accomplish because of all the work that went into it. Once the replicated tables were stable, I first identified which stored procedures were utilizing those tables (I used Idera’s SQL Search for this). Then I changed each procedure to read the tables from the new location.

Next, I had to change any SSRS reports that had hard coded calls to those old tables (Note: don’t do this. Always use a stored procedure). Finally, I looked for any views that called the tables and adjusted those as well.

In two weeks’ time, over 500 stored procedures, reports and views were manually changed.

It is probably worth noting that this was all done in Production simply because we do not have a test environment for this system.  Yes, I did get a few bumps and bruises for missing a few table calls in stored procedures, or for typos or nasty collation errors that arose.  These were bound to happen, and some changes I was not able to test during the day.  All in all it went really well. Having a test environment would have alleviated these, but not all of us have the luxury.


The OOPS: Unfortunately, not long after I implemented the first couple of tables I began to notice blocking. When I investigated, I found it to be replication. I forgot a very important step, which thanks to a blog post by Kendra Little I was able to quickly identify and solve. I needed to turn on Allow Snapshot Isolation and Is Read Committed Snapshot On. Her blog was a HUGE help. You can read all the details as to why this is important on her blog here: http://www.littlekendra.com/2016/02/18/how-to-choose-rcsi-snapshot-isolation-levels/ . Once those two options were implemented, the replication ran seamlessly and the blocking disappeared.
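A sketch of turning on those two options with T-SQL rather than through the database properties dialog. The database name ReportingDB is hypothetical. Enabling READ_COMMITTED_SNAPSHOT needs the database to have no other active connections, so plan it for a quiet moment.

```sql
-- ReportingDB is a hypothetical name for the subscriber database.
ALTER DATABASE ReportingDB SET ALLOW_SNAPSHOT_ISOLATION ON;

-- This statement waits until it is the only connection in the database.
ALTER DATABASE ReportingDB SET READ_COMMITTED_SNAPSHOT ON;

-- Confirm both settings.
SELECT name,
       snapshot_isolation_state_desc,
       is_read_committed_snapshot_on
FROM sys.databases
WHERE name = N'ReportingDB';
```

Both options keep row versions in tempdb, so it is worth watching tempdb usage after the change.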

Step 3: Index Review

First of all, I always preach as a Lone DBA: don’t waste your time reinventing the wheel, use what is out there. So I turned to the trusted scripts from Glenn Berry (B|T). You can find them here: https://sqlserverperformance.wordpress.com/2016/06/08/sql-server-diagnostic-information-queries-for-june-2016/ . I am not going to supply snippets of his code; feel free to download them directly from his site to review.

I started by reviewing duplicate indexes and deleted\adjusted accordingly where needed. Then I went on to looking for missing indexes (where some magic happens). This reduced the amount of I/O because proper indexing lessened the number of records that had to be read.

Now, just because these scripts stated they were missing, I didn’t just create them; I evaluated their usefulness and determined if they were worth the extra storage space and overhead. Glenn’s script gives you a lot of information to help decide on the index effectiveness. As you can see with the first one in the result set, if the index were added, over 45,000 user seeks would have utilized it and query cost would drop on average by 98.43%.  Again, I didn’t arbitrarily add this index because it was in the list.  Once I determined I would not be creating a duplicate or similar index on the table, and given the potential of better performance with the suggested index, it was added.

[Screenshot: missing index results]

Oh one more OOPS…(why not, learn from my mistakes)

After going thru the index exercise and adding indexes to the tables (in the subscriber), I lost all of them minus the Primary keys. Yep, I made one change to a replicated table and the replication reinitialized; all my indexes were dropped. Needless to say I was not a happy camper that day. Lucky for me, each index I added was scripted and put into a help desk ticket. I was able to go back thru all my tickets and resurrect each index I needed. Now, to be smart, I have scripted all of them and placed those into one file, so I can re-add them all if needed in future. I haven’t found a way around this yet, so if anyone has any information on how to, feel free to let me know.

Step 4: Performance Tune Slow Stored Procedures (the fun part for me)

Armed with Grant Fritchey’s (B|T) book on Execution plans for reference, I began tuning any stored procedure I was aware of that was taking more than 2 minutes to run. In total, I tuned about 77 of them; most were report related or part of data loads. I found many benefited from indexes being placed on temp tables within the procedures. Others were doing too many reads based on bad WHERE clauses or joins.
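As a small sketch of the temp table idea, here is a work table that gets an index before it is read. The table, columns and rows are made up for illustration and are not from the actual procedures.

```sql
-- Hypothetical work table, as a stored procedure might build it.
CREATE TABLE #InvoiceWork
(
    invoice_id INT            NOT NULL,
    item_code  VARCHAR(25)    NOT NULL,
    amount     DECIMAL(12, 2) NOT NULL
);

INSERT INTO #InvoiceWork (invoice_id, item_code, amount)
VALUES (1, 'A100', 10.00),
       (2, 'B200', 25.50),
       (3, 'A100', 7.25);

-- Index the column that later joins and filters use.
CREATE CLUSTERED INDEX IX_InvoiceWork_item_code
    ON #InvoiceWork (item_code);

SELECT item_code, SUM(amount) AS total_amount
FROM #InvoiceWork
GROUP BY item_code;

DROP TABLE #InvoiceWork;
```

Whether the index pays for itself depends on how many rows land in the temp table and how often it is read, so it is worth comparing plans before and after.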

Another thing I ran across was functions used in WHERE clauses or joins. An example is date conversion functions that were converting both From and To Dates used in a BETWEEN statement. The functions caused each date value to be processed by the function before being evaluated by the WHERE clause, causing many more reads than necessary. To work around this, I read in the data and converted the dates into a temp table, then did my JOINS and WHERES on the already converted data. Alternatively, depending on what the statement was, I converted the value and placed it in a variable for later evaluation.

There were so many more things I came across and tuned, such as implicit conversions, table spools, and sorts that were not optimal. All of these were fixed by little code changes. I am not going into all of that because this post would be quite long, but you get the point.

Happy Side Effects: After cleaning up the tables and implementing replication, I actually freed up 300 GB of storage and greatly reduced our backup and restore times.
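A sketch of the variable approach, under the assumption (not from the original system) that the dates are stored as CHAR(8) text in YYYYMMDD form. The table, columns and rows are hypothetical. The first query converts the column on every row, which prevents an index seek; the second converts the two parameters once and compares against the raw column.

```sql
-- Hypothetical table with dates stored as YYYYMMDD text.
CREATE TABLE #Invoices
(
    invoice_id        INT     NOT NULL,
    invoice_date_text CHAR(8) NOT NULL
);
CREATE CLUSTERED INDEX IX_Invoices_date ON #Invoices (invoice_date_text);

INSERT INTO #Invoices (invoice_id, invoice_date_text)
VALUES (1, '20170115'), (2, '20170220'), (3, '20170305');

DECLARE @FromDate DATE = '2017-02-01',
        @ToDate   DATE = '2017-02-28';

-- Slow pattern: the conversion runs against the column for every row.
SELECT invoice_id
FROM #Invoices
WHERE CONVERT(DATE, invoice_date_text) BETWEEN @FromDate AND @ToDate;

-- Faster pattern: convert the parameters once (style 112 = yyyymmdd)
-- so the column can be compared as stored and the index can be used.
DECLARE @FromText CHAR(8) = CONVERT(CHAR(8), @FromDate, 112),
        @ToText   CHAR(8) = CONVERT(CHAR(8), @ToDate, 112);

SELECT invoice_id
FROM #Invoices
WHERE invoice_date_text BETWEEN @FromText AND @ToText;

DROP TABLE #Invoices;
```

Both queries return the same rows here; the difference shows up in the execution plan and the number of reads on a large table.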

Summary:

Things are running much better now; introducing Replication reduced enough disk I/O to keep the system viable. For now, latency hovers on average between 2 and 200 milliseconds, which is a vast improvement. I do, however, still see spikes in the thousands of milliseconds, and users still complain of slowness when they run large ad-hoc queries within the application (JD Edwards E1). Unfortunately, that goes back to hardware and the application itself, which are things that I cannot improve upon.  The good news is, I am hearing a rumor that we will be installing a Simplivity solution soon. I am very excited to hear that. I’ll blog again once that solution is in place and let you know how that goes.