How to get started with Always Encrypted for Beginners Part 3: One Two Punch

A few weeks ago, I wrote Parts 1 and 2 of this series, a beginner's guide to Always Encrypted. In Part 3, I am going to go over the roadblocks I ran into when trying to implement this solution for a client. The goal of the project was to prevent the DBA from being able to view salary information while still allowing the application to function without issue.

We were able to encrypt the data easily, but the entire process was not without issues. We had to remove default constraint values to get it implemented; Always Encrypted does not currently support constraints. Fortunately, the defaulted zero value they had in place was easily fixed and we were off and running. SELECT statements from SSMS were returning encrypted values, and the application was able to read the encrypted values and return the real values.

The real challenges started when the client began to test their application code. The first thing we hit was triggers.

The table had several insert triggers associated with the columns that were now encrypted. Since the data was now encrypted, the insert triggers would fail. Again, we lucked out, and they were able to recode some things in order to remove the triggers. Of course, since troubles always come in threes, this was no different: first the constraint problem, then the triggers, then we hit the biggest roadblock, the one that halted our Always Encrypted implementation.

SELECT SUM(Salary) AS TotalSalary FROM [Table];

Always Encrypted does not currently support aggregations. Unfortunately, this is salary data and the code is full of aggregations. There was no way around it, and it was not feasible at this point to move the aggregations to the application side. Sadly, we had to abandon this solution and propose another route. The good news is that this issue will be fixed in the next version of SQL Server; you can learn about the secure enclaves technology here.

In researching possible roadblocks of Always Encrypted, I came across a great blog post by Aaron Bertrand (B|T) where he lists several gotchas. If you are looking to implement Always Encrypted, I highly recommend starting with his post to discover what kind of issues you may run into.

Additionally, here is the official list directly from MSDN of what is not supported.

Always Encrypted is not supported for columns with the below characteristics (for example, the ENCRYPTED WITH clause cannot be used in CREATE TABLE/ALTER TABLE for a column if any of the following conditions apply to the column):

  • Columns using one of the following data types: xml, timestamp/rowversion, image, ntext, text, sql_variant, hierarchyid, geography, geometry, alias, user-defined types.
  • FILESTREAM columns
  • Columns with the IDENTITY property
  • Columns with ROWGUIDCOL property
  • String (varchar, char, etc.) columns with non-bin2 collations
  • Columns that are keys for nonclustered indices using a randomized encrypted column as a key column (deterministic encrypted columns are fine)
  • Columns that are keys for clustered indices using a randomized encrypted column as a key column (deterministic encrypted columns are fine)
  • Columns that are keys for full-text indexes containing encrypted columns (both randomized and deterministic)
  • Columns referenced by computed columns (when the expression does unsupported operations for Always Encrypted)
  • Sparse column set
  • Columns that are referenced by statistics
  • Columns using alias type
  • Partitioning columns
  • Columns with default constraints
  • Columns referenced by unique constraints when using randomized encryption (deterministic encryption is supported)
  • Primary key columns when using randomized encryption (deterministic encryption is supported)
  • Referencing columns in foreign key constraints when using randomized encryption or when using deterministic encryption, if the referenced and referencing columns use different keys or algorithms
  • Columns referenced by check constraints
  • Columns in tables that use change data capture
  • Primary key columns on tables that have change tracking
  • Columns that are masked (using Dynamic Data Masking)
  • Columns in Stretch Database tables. (Tables with columns encrypted with Always Encrypted can be enabled for Stretch.)
  • Columns in external (PolyBase) tables (note: using external tables and tables with encrypted columns in the same query is supported)
  • Table-valued parameters targeting encrypted columns are not supported.

The following clauses cannot be used for encrypted columns:

  • FOR JSON PATH
  • FOR XML

Although I did take a few punches with this implementation, I think Always Encrypted is a great choice to keep your data secure, and I will definitely use it in the future. I am looking forward to seeing how this option improves in the next versions of SQL Server.

Securing your environment is important no matter what option you implement. Learn more about database security: register for the Denny Cherry and Associates webcast on Friday, January 19th, 2018, where we talk about database security, including Spectre and Meltdown.

It’s All in the Name, Index Naming Conventions

A while back, if you are on Twitter, you can probably recall my ranting about the 949 indexes I was reviewing. The task was to identify duplicate indexes and consolidate them or discard the unneeded ones. My ranting was not about the duplicates per se; it was about the index names. It only takes a second to give an object a name that tells you what it is. Below I will show you some examples and give you an easy script that will help you generate your index names. Taking a little time to name things appropriately can go a long way: it is not only time saving but also helps reduce redundancy.

The DON'Ts
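For illustration, here are made-up names in the same style as the ones I was reviewing (the tables and columns are invented):

[NonClusteredIndex-20160801-153342]
[_dta_index_Employee_Cover2]
[_dta_index_Employee_5_1045578763__K2_K5_K7_K9_K11_K13_K1_K3_K4_K6_K8_K10_K12_K14_K15_K16_17_18]
[IX_LastName]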

As you can see from above, none of the names gives a complete indication of what the index encompasses. The first one did indicate it was a non-clustered index, so that was good, but it includes the creation date, which to me is not needed. At least I knew it was not a clustered index. The second index did note it is a "Covering Index," which gave some indication that many columns could be included; I also know it was created with the Database Engine Tuning Advisor because of the dta prefix. The third index was also created by the Tuning Advisor, but it was left with the default dta naming convention. Like the first one, I know the date it was created, but instead of the word Cover2 I know there are 16 key columns, noted by the "K#"s, while the plain numbers tell me those are included columns. However, I still have no idea what these numbers denote without looking deeper into the index. The last index is closer to what I like; however, the name only tells me one column name when in fact it encompasses five key columns and two included columns.

The DOs
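Again, these are invented stand-ins in the same style as the originals:

IX_LastName_FirstName
CIX_EmployeeID
IX_LastName_FirstName_MiddleName_Title_Suffix_EmailAddress_Phone_City_PostalCode
IX_Employee_LastName
IX_LastName_Includes_BirthDate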

Above we see a few good examples with varying naming conventions, but each tells me much more than what we saw in the "Don'ts" list. The first one, I know right away, is a non-clustered index with two fields. The second is a clustered index with one field. The third is an index with nine fields, probably a covering index of some sort, which tells me it is likely important to a specific query or procedure. Index four uses the name of the table and the field, which does give me more information, but given that index names are limited to 128 characters I prefer to leave the table name out. The last one is closest to one of my favorites, because it gives even more information: the name lets us know that it has an included column of BirthDate.

The Script

Here is the script I use when creating indexes. It will go through, identify a missing index, and build a CREATE INDEX statement using a standard naming convention.
NOTE: This is a modified version of what we use at DCAC, shown just to illustrate how I build a standard CREATE INDEX statement in my code. It is not meant to be used to identify missing indexes, as that is not the purpose of this post; I have removed those pieces from the script.
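A simplified sketch of the idea, using the missing-index DMVs to build a standardized name and statement (this is illustrative, not the full DCAC script):

-- Build a name from the key columns (IX_Col1_Col2) and a CREATE statement
SELECT 'USE ' + QUOTENAME(DB_NAME(mid.database_id)) + ';' + CHAR(13) +
       'CREATE NONCLUSTERED INDEX [IX_' +
       REPLACE(REPLACE(REPLACE(
            ISNULL(mid.equality_columns, '') + ISNULL('_' + mid.inequality_columns, ''),
            '[', ''), ']', ''), ', ', '_') + ']' +
       ' ON ' + mid.statement +
       ' (' + ISNULL(mid.equality_columns, '') +
       CASE WHEN mid.equality_columns IS NOT NULL
             AND mid.inequality_columns IS NOT NULL THEN ', ' ELSE '' END +
       ISNULL(mid.inequality_columns, '') + ')' +
       ISNULL(' INCLUDE (' + mid.included_columns + ')', '') AS CreateStatement
FROM sys.dm_db_missing_index_details AS mid
JOIN sys.dm_db_missing_index_groups AS mig
     ON mig.index_handle = mid.index_handle
JOIN sys.dm_db_missing_index_group_stats AS migs
     ON migs.group_handle = mig.index_group_handle
WHERE mid.database_id = DB_ID();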

Create Statement Output

This statement gives the proper database context and CREATE statement syntax; it adds all the needed key columns within the parentheses, separated by commas. In addition, it adds the word INCLUDE and wraps the included columns, also in parentheses and comma-separated. Note the index name only includes the key columns, which is just my preference.
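For example, the generated statement might come out like this (database, table, and columns invented):

USE [AdventureWorks2016];
CREATE NONCLUSTERED INDEX [IX_LastName_FirstName]
ON [HumanResources].[Employee] ([LastName], [FirstName])
INCLUDE ([BirthDate], [HireDate]);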

Summary

Now, everyone has their own naming conventions. You do you; however, you should stay consistent and give your names some meaning. When others look at our objects, they should know what each one is doing or be given a good clue as to what it's for. This not only helps to quickly identify an index's definition but also keeps you from creating duplicates. By looking at the names, you can tell whether the columns you need are already included in other indexes. Naturally, you can't just trust the name; you have to dig deeper while examining your indexes, but it at least gives you a realistic starting point.

How to Get Started with Always Encrypted for Beginners Part 2

In this post we will pick up where we left off in Part 1. If you haven't read that yet, please go back and do so.

Now that we have encrypted our columns, it’s time to take a look at how we decrypt them inside SQL Server Management Studio or through our applications. You’ll be surprised to see how easy it is.

Verify Your Setup

First, let's verify that the table is still encrypted and that nothing changed after you ran through the Part 1 examples. To confirm, simply query sys.columns, script out the table, or query the data to check that the BirthDate column is still encrypted. A quick metadata check is below.
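The table name dbo.Employee_Encrypted here is a stand-in for whatever you created in Part 1:

SELECT c.name AS column_name,
       c.encryption_type_desc,
       c.encryption_algorithm_name,
       cek.name AS column_encryption_key
FROM sys.columns AS c
JOIN sys.column_encryption_keys AS cek
     ON cek.column_encryption_key_id = c.column_encryption_key_id
WHERE c.object_id = OBJECT_ID('dbo.Employee_Encrypted');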

You can also just SELECT and look at the data. Here you see the encrypted values for the data in the BirthDate column.


Decrypt with SQL Server Management Studio

Viewing decrypted data within SQL Server Management Studio (SSMS) is very easy. SSMS uses .NET 4.6 and the modern SQL Server client, so you can pass in the necessary encryption options; it uses the connection string to access the Master Key and return the data in decrypted format.

First, create a new SQL connection and click Options to expand the window.

Then go to the Additional Connection Parameters tab of the login window and simply type column encryption setting = enabled. Then choose Connect.

Now try a SELECT from your columns.

If you did it correctly you will see the decrypted BirthDate column.

Now, the reason this works is that the Master Key certificate is stored in the Windows Certificate Store of this SQL Server machine; the Column Encryption Key lives in the database, encrypted by that Master Key. The Master Key was set up in the Windows Certificate Store in Part 1.

Decrypt with an Application

According to MSDN, for the application to decrypt data, the account that connects to the database must have the VIEW ANY COLUMN MASTER KEY DEFINITION and VIEW ANY COLUMN ENCRYPTION KEY DEFINITION database permissions. These permissions are required to access the metadata about Always Encrypted keys in the database.
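A sketch of those grants (the user name is a placeholder):

GRANT VIEW ANY COLUMN MASTER KEY DEFINITION TO [AppUser];
GRANT VIEW ANY COLUMN ENCRYPTION KEY DEFINITION TO [AppUser];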

Once those permissions are established all you must do is change your application connection string to include Column Encryption Setting=enabled. Below is an example using SQL Server integrated security.

Example

string connectionString = "Data Source=server63; Initial Catalog=Clinic; Integrated Security=true; Column Encryption Setting=enabled";
SqlConnection connection = new SqlConnection(connectionString);

Summary

Decrypting the data when you have the Master Key stored on your database server makes it easy, but it also gives the DBA access to the decrypted data. Make sure when you are planning to use Always Encrypted that you consider who you want to have access to the data and where you want to store the keys. There are many more layers of security you can add by defining those items. The examples I gave in both Part 1 and Part 2 are the least complex and therefore not the most secure, but they give you a beginner's overview of how to implement it. You need to examine your application to understand whether it fits within the currently supported features of Always Encrypted.

How to get started with Always Encrypted for Beginners Part 1

Encryption has always been intriguing to me but seemed like it could be a very complex process to set up. However, SQL Server made it very simple with the introduction of Always Encrypted (AE) in SQL Server 2016 and Azure SQL Database. Unlike Transparent Data Encryption (TDE), which only encrypts data files and backups at rest, AE is configured at the column level, not the database level. Additionally, Always Encrypted is available in Standard (and Express) Edition starting with SQL Server 2016 SP1. You can easily encrypt a Social Security number (SSN), which is considered very sensitive within the United States, or a Salary column in a table with just a few clicks. In past versions of SQL Server you could use cell-level encryption (CLE) to do this, but it required code changes, the keys were stored in the database, and the data was sent to the application unencrypted. Which brings us to the other benefit of AE: DBAs can no longer see the unencrypted values of the data, as they could with CLE, because the column master key is stored outside of SQL Server.

Let’s see how you do it and walk through what each of these options means.

Using the AdventureWorks 2016 CTP3 HumanResources.Employee table, we are going to encrypt the BirthDate column.

Start by Right Clicking on the Table > Choose Encrypt Columns

It brings up a Wizard, one of the two recommended ways to configure AE. The other option is to use PowerShell.

Click Next on the Intro Screen

You will note in the example below that it lists the columns and then shows the encryption state, which indicates whether the column is eligible for encryption. There are several unsupported column characteristics that may make a column ineligible; this link to MSDN describes them in further detail. The items on this list are unsupported because they have a default constraint or a check constraint defined.

This is just an example of one of the roadblocks you may encounter. So, let's take a step back and set up an example we can easily use.

The Setup

Run the script below to create a copy of the Employee table. We are doing this to get a table without any constraints.
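The original script isn't shown here, but a minimal equivalent uses SELECT...INTO, which copies the data while leaving behind constraints, triggers, and defaults (the new table name is my choice):

-- Copies rows only; constraints, triggers, and indexes do not come along
SELECT *
INTO dbo.Employee_Encrypted
FROM HumanResources.Employee;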

Now again, Right Click on the Table > Choose Encrypt Columns

In this case, the column we want is BirthDate, so I place a check next to it. To continue, I need to choose a Type of Encryption.

There are two possibilities: Deterministic and Randomized.

MSDN defines Deterministic encryption as encryption that always generates the same encrypted value for a given plain text value. This means that if you have a birth date of 01/03/1958, it will always be encrypted to the same value each time, such as ABCACBACB. This allows you to index the column and use it in WHERE clauses, GROUP BY clauses, and joins.

Randomized encryption, per MSDN, uses a method that encrypts data in a less predictable manner. This makes randomized encryption more secure: using the example above, each encrypted value of 01/03/1958 will be different. It could be ABCACBACB, BBBCCAA, or CCCAAABBB. All three encrypted values subsequently decrypt to the same value. Since the encrypted value is random, you cannot perform search operations and the like as you can with deterministic encryption.

In most cases, you will want to use deterministic encryption. The places where randomized encryption makes sense are columns with a small range of distinct values, because with deterministic encryption an attacker might be able to determine the underlying value by brute force, trying a variety of parameters. Examples of this kind of data include birth date, blood type, or credit card verification numbers (CVV).

So, going back to our example, select deterministic from the drop down.

The next step is to choose an Encryption Key. Let’s choose CEKAUTO (NEW). This stands for Column Encryption Key. You can use the same Key for every column or choose a new one for each.

Then click NEXT

Every encryption key must have a MASTER KEY. This is the value used to protect the column encryption keys. In the example below we are just going to go with the defaults. If you have already generated a master key on your SQL Server instance, you can choose to reuse it for any new column you add.

One of the most complex parts of encryption is determining where to store the keys and who will have access to them. You can store these keys on a client machine using the Windows Certificate Store or in Azure Key Vault.

The next screen has a great feature, and kudos to Microsoft for this add-in: you can choose to generate a PowerShell script so you can rerun the setup later or store it in your source control.

After clicking NEXT, you’re done. The wizard will create all the keys, and encrypt the selected columns.

Now if you SELECT from the table you will see the values in Birthdate are now encrypted.

Key Management in Windows Certificate Store

If you would like to see where the keys are stored within Windows, you can do so as follows. Go to the Microsoft Management Console (type MMC in your Run bar, Win+R). Then go to File > Add/Remove Snap-in. Certificates will be the third one down; click Add.

If you scroll back up, you will note that when we created our Master Key it did so under CURRENT USER, so choose My user account.

Expand Personal and Click Certificates (Key)

So, there you have it. Encryption made easy. This is only the tip of the iceberg. You need to understand how your environment will access and decrypt the data, encrypting is only part of the puzzle. I will cover how to get SSMS to decrypt the data in Part 2, in the meantime play around with it.

Thankful DBA

This week is Thanksgiving in the United States, so I thought it fitting to write a quick blog on what I am thankful for as a DBA. These are in no particular order and feel free to respond with something you are thankful for. I’d love to hear it.

  1. Glenn Berry’s Diagnostic Scripts- (B|T) Used these for years. Really a great set of scripts and explanations that we all should be grateful for.
  2. Ola Hallengren's (B) Maintenance scripts – Index Optimization, Backup, and Integrity Checks for all! They have become an industry standard and continue to get better and better.
  3. RCSI (Read Committed Snapshot Isolation) – My Readers can stop blocking Writers! Thanks to Kendra Little (B|T) for this great blog.
  4. SSMS Results to grid and copy with header- I do this a million and one times a day. Ctrl+Shift+C .
  5. Query Store – Having the plan run stats and being able to force a plan, LOVE IT! Thanks Conor Cunningham and Microsoft for that one.
  6. Availability Groups – Easy setup and trustworthy. And, well, I like the name better than Mirroring.
  7. DMVs (Dynamic Management Views) – Show me the money! They have all the SQL Server internals goodies, mine for the taking.
  8. Profiler– #ProfilerForLife nuff said, my most trusted friend.
  9. Columnstore Indexes – I feel the need, the need for speed! Who doesn’t like up to 10x Query Performance gains and 10x the data compression?
  10. Paul Randal’s Waits Library (B|T)– I can’t tell you how many times I’ve referred to this. So much useful information!
  11. Adam Machanic's sp_WhoIsActive (B|T) – This is my GO TO for seeing what's actively going on; it's the first thing I run.
  12. Sentry One Plan Explorer– Execution Plans on STEROIDS! Yes, please. Love the detail and ease of use.
  13. RedGate’s SQL Prompt- My coding is downright ugly. With a quick Ctrl+K, Ctrl+Y my code is sleek and readable. Not to mention I love the code snippets.
  14. Grant Fritchey’s (B|T) Execution Plans book- I can’t wait for 3rd Edition, someone took my very loved highlighted, tabbed, marked up copy. I need another!
  15. Power BI – It puts the slicing and dicing into the user’s hands, giving Management easy visualizations of their data for analysis. Less reports for me to write, yippie.  Thank you Microsoft.
  16. dbatools – Great PowerShell modules for migrating databases. No more doing it the hard way.

Last and most importantly I am grateful for #SQLFamily, Bloggers, and Twitter. I learn from you every damn day!

Happy Thanksgiving!

~Monica

Do Not Pass GO!

What is the GO statement and why is it so important to use? When do I have to use it? When do I not use it? These are questions that have passed through my head from time to time while writing T-SQL within SQL Server.

First What Is It and When Should I Use It?  

The GO statement lets SSMS (the client tool) know where a batch ends. It basically defines the scope of what you are trying to send to the Database Engine. The example below sends two separate batches: the first changes the database context to run the next statement under, followed by the execution of the SELECT against the Demo database. Simple, yes.

Example
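Something like this (the table is a placeholder):

USE Demo;
GO
SELECT FirstName, LastName
FROM dbo.Customer;
GO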

Gotchas

I've been caught out by this behavior in the past. Using GO in stored procedures can be tricky. There are times when you want to run a batch of statements together, but if you put a GO into the procedure and compile it, you will notice that you lost any code that came after the GO. The GO signaled that my ALTER or CREATE PROCEDURE statement was done; everything below it was treated as a separate batch rather than part of the stored procedure. A sketch of the problem is below.
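The procedure name here is invented:

CREATE PROCEDURE dbo.usp_TwoSteps
AS
SELECT 1 AS Step1;
GO
-- This statement is no longer part of dbo.usp_TwoSteps;
-- it compiles and runs as a separate batch.
SELECT 2 AS Step2;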

Another gotcha can be both good and bad depending on your need: a variable's life span ends with each GO statement. If you declare a variable, run a statement to populate it, and then send a GO, you can no longer use the variable.

Example
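For example:

DECLARE @name VARCHAR(50) = 'Monica';
SELECT @name;   -- works, same batch
GO
SELECT @name;   -- fails: Must declare the scalar variable "@name".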

Cool things to do with GO

I learned this by chance, just messing around. Did you know that if you put a number after GO, it will run the batch that many times? This can be handy for generating a lot of load against a database for demos.
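For example, this runs the insert 50 times (the table is a placeholder):

INSERT INTO dbo.LoadTest (CreatedDate)
VALUES (GETDATE());
GO 50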

Don't like the word GO? Change it. Yep, you can change it to anything you want: Tools > Options > Query Execution.

Change it to RUNNOW.

Let’s Try

Hmmm, why didn't that work? Because I ran it in an existing open window (session). Let's try that again.

TADA! Much better.

Now that you know what it does, feel free to advance to GO and collect your $200. Enjoy.

What Are These Backup Settings All About?

I ran across a client the other day that had the Backup and Recovery options set like the picture below, because that is how they default. The database administrator didn't know what to configure them as, so he left them alone. I find this is the case with a lot of options. For the most part leaving the defaults can be just fine, but leaving others can leave you missing out on some good features.

Let’s start from top to bottom.

Default backup media retention in days. Now, the first thing that comes to my mind is "hey, this is a cleanup job." SCORE! Thinking that maybe this will auto-delete old backups. After all, isn't that what retention means? NOPE, not in this case.

In this case, it's just the number of days before a backup media set can be OVERWRITTEN. If the DBA goes to overwrite the media before those days have passed, SQL Server will give a warning message. You'll note in every backup action you do that the RETAINDAYS option is filled in; in this case it will always reflect 90 now that we have changed it. In general, this is a pointless option to me. I don't normally overwrite backup media; to me this was more relevant when tapes were used and disks were harder to come by, so I leave it alone.

TSQL
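Setting it to 90 days would look something like this ('media retention' is an advanced option):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'media retention', 90;
RECONFIGURE;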

 

GUI

Compress backup. This one is exactly what you think it is, no guessing here. Backup compression is one I highly recommend changing from the default. Compression is a HUGE topic I will save for another time. In short, the smaller the files, the less space they take up; less data stored means less IO (and less data sent to your backup device), and therefore your databases back up and restore faster. Here is a great MSDN link to learn more about the benefits of backup compression. Backup compression is included in all editions of SQL Server since 2008 R2, so use it!
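Turning compression on by default is a one-liner:

EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;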

Recovery Interval (in minutes). Now, this one I always thought meant Recovery Point Objective, in other words how much data I am willing to lose, in minutes. I was partially right. According to MSDN, this option defines an upper limit on the time recovering a database should take. The SQL Server Database Engine uses the value specified for this option to determine approximately how often to issue automatic checkpoints on a given database.

This is an option I don't change. I have yet to see a scenario where I want to override when SQL Server takes a checkpoint on a database by default. There are times when I want to force a checkpoint, but it's not something I am going to set a standard for. The only reason I have heard is to reduce IO on a data drive, but to me that comes at too high a cost.

TSQL
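I leave this at the default of 0, which lets SQL Server manage checkpoint frequency; if you did need to change it, it is another advanced sp_configure option:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'recovery interval (min)', 0;
RECONFIGURE;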

So, there you have it, three more options that may not be a mystery for some any longer.

Quick Model Database Tidbit

Are you using your Model Database to its full potential?

I am finding more and more that database administrators are not using the model database to its fullest potential, and some not at all.

What is that Model Database for?

The model database is basically the default setup (template) for all other databases created on a SQL Server instance. All databases created after install will inherit the properties of this database.

Why Configure It?

Using the model database can ensure consistency within your environment and is a quick way to automate your database setups. Below is a list of things I've used in my environment and others.

Top (in no particular order) Settings I have Implemented Through Model

  • Default Growth Settings
  • Query Store Settings
  • Recovery Models
  • Read Committed Snapshot Isolation
  • Allow Snapshot Isolation
  • Auto Update Statistics Asynchronously
  • Compatibility Levels

Now, there are some things that databases will NOT inherit from model; some of these I learned the hard way.

  • File Groups
  • CDC (Change Data Capture)
  • Collations
  • Database Owner
  • Encryption

Scripts to turn these options on
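A sketch of what those scripts look like; the specific values (growth sizes, compatibility level) are examples, so adjust them to your environment:

ALTER DATABASE model SET RECOVERY SIMPLE;
ALTER DATABASE model SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE model SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE model SET AUTO_UPDATE_STATISTICS_ASYNC ON;
ALTER DATABASE model SET COMPATIBILITY_LEVEL = 140;
ALTER DATABASE model SET QUERY_STORE = ON;
-- Default growth settings (modeldev/modellog are model's logical file names)
ALTER DATABASE model MODIFY FILE (NAME = modeldev, FILEGROWTH = 256MB);
ALTER DATABASE model MODIFY FILE (NAME = modellog, FILEGROWTH = 128MB);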

What Other Things Can You Do?

Now, you can go above and beyond just the database properties. You can add tables, views, triggers, functions, etc. to your model database, and every time a new database is created those objects will also exist. Why is this useful? In the past, I've used this for tracking my DDL (data definition language) changes. I created a trigger that would insert the user, object, date and time, and a text snippet of any ALTER/DROP/CREATE statement run on a database into a table. For it to work, the trigger needed to exist on all databases; a sketch of the idea is below.
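The trigger and table names here are my own invention:

CREATE TABLE dbo.DDLAudit
(
    EventDate   DATETIME2     NOT NULL DEFAULT SYSDATETIME(),
    LoginName   NVARCHAR(128) NOT NULL,
    EventType   NVARCHAR(100) NOT NULL,
    ObjectName  NVARCHAR(256) NULL,
    TSQLCommand NVARCHAR(MAX) NULL
);
GO

CREATE TRIGGER trg_DDLAudit
ON DATABASE
FOR DDL_DATABASE_LEVEL_EVENTS
AS
BEGIN
    SET NOCOUNT ON;
    -- EVENTDATA() returns the details of the DDL statement as XML
    DECLARE @e XML = EVENTDATA();
    INSERT INTO dbo.DDLAudit (LoginName, EventType, ObjectName, TSQLCommand)
    VALUES
    (
        @e.value('(/EVENT_INSTANCE/LoginName)[1]', 'NVARCHAR(128)'),
        @e.value('(/EVENT_INSTANCE/EventType)[1]', 'NVARCHAR(100)'),
        @e.value('(/EVENT_INSTANCE/ObjectName)[1]', 'NVARCHAR(256)'),
        @e.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'NVARCHAR(MAX)')
    );
END;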

Final Words

We all know each environment is different, so don't just go and implement everything; tailor it to your needs. I suggest you take a look at yours and see if there is anything you can adjust. You may be surprised at what you can tweak.

Note:

*In testing this, I found that if you create a new database using CREATE DATABASE in T-SQL, the auto-growth sizes do not get inherited by the new database, but everything else did. If I create a new database using the GUI, these settings do propagate. I am not sure if this is by design or a bug.

Synchronous VS Asynchronous Statistics Updates

One of the things I've been able to implement to help with performance is changing from synchronous statistics updates to Auto Update Statistics Asynchronously. It's a simple change that can have a big impact when implemented in highly transactional OLTP environments. Notice I said OLTP, not OLAP: since data in an OLAP environment tends not to be as dynamic, it's rare to enable this in a data warehouse.

So, what’s the difference between the two and why does it help?

Synchronous (defaulted as AUTO_UPDATE_STATISTICS =TRUE)

By default, when Auto Update Statistics is set to True, the SQL Server Query Optimizer will automatically update statistics when data has met a threshold of changes (insert, update, delete, or merge) and the statistics are potentially stale. When statistics are stale, execution plans can become suboptimal, which can lead to degraded performance.

This best practice option ensures your statistics stay up to date as much as possible. Each time a cached query plan is executed, the Optimizer checks for data changes and potentially generates new statistics. This behavior is exactly what we want, but there is a catch: the query will be "held" while the statistics are updated, and the plan will recompile to use the new values before running. This can slow down the execution process dramatically.

Auto Update Statistics Asynchronously (AUTO_UPDATE_STATISTICS_ASYNC =TRUE)

This option does the same thing as the above, but with one significant difference: the query will NOT be "held" while the statistics are updated. Queries run "as is" against the existing statistics until the Query Optimizer completes the statistics update; the next time a query runs, it will recompile and use the new statistics.

Confused yet? So, now in English.

When the asynchronous setting is on, a query will run as it is, without waiting for the statistics it uses to be brought up to date; once they are updated, subsequent executions run with the new numbers. The query does not have to wait for the update to finish. That's where you get your performance boost: by not having to wait.

Check your settings using TSQL on ALL Databases
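Something like this shows both settings for every database:

SELECT name,
       is_auto_update_stats_on,
       is_auto_update_stats_async_on
FROM sys.databases;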

How to Turn it on TSQL
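And to enable it on a database (the name is a placeholder):

ALTER DATABASE [YourDatabase] SET AUTO_UPDATE_STATISTICS_ASYNC ON;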

GUI

Under Database Properties > Options

NOTE: To enable this option Auto Update Statistics must be left ON.

Last Words

Remember, every environment is different; be sure to test this before implementing it in production. A simple change from synchronous to asynchronous can make a difference. It is definitely something to add to your performance tuning tool belt.

Does Your Code Have a Preamble?

Okay, here is a pet peeve of mine: I think every stored procedure, function, view, etc. should contain a block of code I refer to as a preamble. If yours doesn't, I strongly recommend you start adding it. It drives me crazy when I see code with no documentation of any kind telling me what it is for and when it was written or changed.

Why? A preamble documents the use, need, and changes for the code. It also leaves bread crumbs as to how, why, and what you did. I don't know about you, but I may code something and not have to change it for two years. When I do, I think back and ask why I did that, or who changed this code last. Working as a lone DBA, leaving bread crumbs was critical as I constantly jumped from task to task.
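A preamble in that spirit might look like this (the procedure name, dates, and history entries are invented):

/***************************************************************************
Procedure:      dbo.usp_GetEmployeePayHistory
Author:         Monica
Create Date:    2017-01-15
Description:    Returns pay history for a given employee.
Called By:      HR application; nightly payroll job
Run:            EXEC dbo.usp_GetEmployeePayHistory @BusinessEntityID = 1;
Change History: 2017-06-02  Monica  Added @BusinessEntityID validation
                2018-01-09  Monica  Added pay frequency to output
***************************************************************************/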

Above is an example of the preamble I use for all the code I write. It tells who wrote it, what it is, what it is called by, how to run it, and lists any changes made to it. I find one of the most helpful items is the Run documentation: here I place an exact run statement. It shows how the parameters should look and gives me a quick way to test it.

There are a million and one reasons why you should be doing this in your code. If you're not already doing it, just take a second and start. You'll thank me for it later.