When I started with computers in the early 1980s, my personal computer used a cassette recorder to store data. This was the main option if you couldn’t afford a mainframe computer, which most businesses couldn’t. Enterprise Readiness meant that you maybe could get up and running again within a few days if your computer broke or your tapes/disks were damaged.
What requirements do we have of Enterprise Ready database platforms? I’d suggest four areas:
How has SQL Server supported these requirements over time?
SQL Server 2000
I think SQL Server 2000 was the first version where we could really talk about Enterprise Readiness. Let’s look at the new features from an Enterprise Readiness perspective.
We all know that computers break. It could be network cards, power supplies, etc. Windows 2000 introduced support for MSCS (Microsoft Cluster Service, later renamed Windows Server Failover Clustering, WSFC), which means that the operating system can handle automatic failover when a server goes down. Basically you have an active node and a number of passive nodes pointing to the same disk. This required a shared disk, and the only kind of shared disk supported was a SAN (Storage Area Network). If the SAN broke down – well, then you had a big problem.
Another High Availability feature of SQL Server 2000 was Replication, which allows multiple databases to synchronize tables. It could even synchronize SQL Server with completely different databases (for example Oracle). The synchronization could be done using several methods, including transactional replication (which used the transaction logs to continuously track changes) and snapshot replication (which periodically compared the data). However, Replication doesn’t come with any kind of automatic failover handling.
Log Shipping is a simple, yet useful technology. By taking regular backups of the transaction log of a database and replaying them on a second database (by restoring them), the second database is kept synchronized with the main database. If, for example, you take backups every 10 minutes, you will never lose more than 10 minutes of data. This is useful for Disaster Recovery scenarios. The delay can even be an advantage: if you do something really bad to the main database, you can potentially restore objects from the second database.
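A manual log shipping cycle can be sketched in T-SQL roughly like this (the database name and backup share are made up for illustration):

```sql
-- On the primary server: back up the transaction log.
BACKUP LOG SalesDB
    TO DISK = N'\\backupshare\SalesDB_log.trn';

-- On the secondary server: restore the log backup WITH NORECOVERY,
-- so that further log backups can still be applied.
RESTORE LOG SalesDB
    FROM DISK = N'\\backupshare\SalesDB_log.trn'
    WITH NORECOVERY;

-- In a disaster, bring the secondary online for users:
-- RESTORE DATABASE SalesDB WITH RECOVERY;
```

In practice you would schedule the backup, copy and restore steps as jobs rather than running them by hand.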
In 2003, the SQL Server 2000 64-bit edition was released. Going from 32 to 64 bits means you can handle much larger addressable memory, which is crucial for large databases. While 32 bits gives you an addressable memory of about 4 GB, 64 bits gives a limit of about 16 million terabytes (16 exabytes). Since there is still no database in the world larger than about 1 million terabytes (as far as I know), this “ought to be enough” for a few years more.
This feature was not new to SQL Server 2000, but it’s so important that I want to include it here. The term ACID was coined in the early 1980s as an acronym for Atomicity, Consistency, Isolation, Durability. This means that even if the power is cut while updating the database, even if multiple applications write at the same time, and even if you try to write something invalid to the database, it will handle all of this and still be reliable. This is supported by all modern relational databases, and it is the main reason that SQL Server has a transaction log (containing undo and redo records). The guarantee is that once an operation has been confirmed (which means it has been written to the transaction log), you can always trust that it will have updated the database. This is obviously crucial, for example, in a banking application where you must know that transactions don’t suddenly disappear, and it is one of the main reasons many applications use databases instead of just storing data directly in files.
SQL Server 2005
The 2005 version was another very big step in SQL Server history, especially on the Business Intelligence and Data Warehouse side. However, in this blog post I will cover only the traditional database features.
As an alternative to Failover Clustering, Microsoft introduced Database Mirroring. The main difference is that each server stores its data locally, so there is no dependency on a SAN. It can operate in both High Protection mode (with zero data loss) and High Performance mode (with asynchronous data transfer between the principal and mirror server). Another advantage compared to Failover Clustering is that the database on the mirror server can be used (read-only, via a database snapshot). However, this technology also has limitations, mostly that it cannot guarantee that multiple databases are failed over at the same time.
Something that greatly increased the reliability of SQL Server databases was the introduction of checksum verification of the data pages. This means that SQL Server can detect pages that have been corrupted. Damaged databases can often be repaired if the damage is found early.
Instant File Initialization
This is a simple but major performance improvement. When a data file grows, you can configure SQL Server so that it no longer has to fill the new space with zeroes (transaction log files still have to be zero-initialized). Filling the empty space with zeroes can take a long time and could lead to long periods of unresponsiveness (minutes) if the database needed to autogrow.
Rows can be larger than 8000 bytes
In SQL Server 2000, each row of data had to fit in a single 8 KB page. While I don’t recommend designing tables with huge rows, it is a big improvement that this limitation was removed in 2005. This can definitely make things easier for application developers.
T-SQL language improvements
The T-SQL language was greatly improved in 2005. My favorite new features were TRY-CATCH blocks and CTEs (Common Table Expressions, which among other things allow recursive queries). Recompilation of stored procedures also became more efficient, with statement-level recompiles instead of recompiling the whole stored procedure.
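To illustrate the two features, here is a small sketch (the table and column names are made up):

```sql
-- TRY-CATCH error handling, new in SQL Server 2005.
BEGIN TRY
    INSERT INTO dbo.Orders (OrderId, Amount) VALUES (1, 100);
END TRY
BEGIN CATCH
    PRINT 'Insert failed: ' + ERROR_MESSAGE();
END CATCH;

-- A recursive CTE that generates the numbers 1 to 10.
WITH Numbers (n) AS
(
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM Numbers WHERE n < 10
)
SELECT n FROM Numbers;
```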
Often you have no possibility to change the SQL queries, for example if you are running an application that you haven’t developed yourself. In 2005, SQL Server introduced Plan Guides, where you can tell it that when it recognizes a certain SQL query (or runs a certain stored procedure) it should modify its execution plan. The result is that you can do advanced performance tuning of SQL queries that come from third-party providers, where you normally have no chance to edit the queries.
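A plan guide is created with the `sp_create_plan_guide` system procedure. A minimal sketch, where the guide name, statement text and hint are just examples:

```sql
-- Attach a query hint to a statement we cannot edit ourselves.
-- The statement text must match what the application sends, character for character.
EXEC sp_create_plan_guide
    @name            = N'Guide_OrdersByCustomer',
    @stmt            = N'SELECT * FROM dbo.Orders WHERE CustomerId = 42;',
    @type            = N'SQL',
    @module_or_batch = NULL,
    @params          = NULL,
    @hints           = N'OPTION (MAXDOP 1)';
```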
Optimistic concurrency support
An additional isolation level, SNAPSHOT, was introduced. It uses row versioning, so readers no longer take shared locks and no longer block writers. This can solve problems with deadlocks and with applications that spend most of their time waiting because of locking issues.
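Snapshot isolation has to be enabled per database before sessions can use it. A sketch with example database and table names:

```sql
-- Enable snapshot isolation for the database (one-time setup).
ALTER DATABASE SalesDB SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In a session: read a consistent view of the data without blocking writers.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    SELECT SUM(Amount) FROM dbo.Orders;  -- sees data as of transaction start
COMMIT;
```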
SQL Server 2005 opened up new possibilities to diagnose the server and databases on a more detailed level through something called Dynamic Management Views and Dynamic Management Functions. These make it possible to check things like fragmentation, memory usage and statistics on how often an index is used. There are even DMVs that give you suggestions on new indexes. These DMVs are very useful for performance tuning and troubleshooting.
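Two typical examples, using the index usage DMV and the physical stats DMF:

```sql
-- How often each index in the current database has been used since the last restart.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       i.name                   AS index_name,
       s.user_seeks, s.user_scans, s.user_updates
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id AND i.index_id = s.index_id
WHERE s.database_id = DB_ID();

-- Fragmentation of all indexes in the current database.
SELECT OBJECT_NAME(object_id) AS table_name,
       index_id,
       avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED');
```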
Partitioning was another key feature introduced in SQL Server 2005. Large tables can benefit greatly from being split into partitions. It can increase performance as well as make it possible to spread your partitions across different disks. For instance, you can put old data on a slower disk and new data on a faster disk. Also, two applications can write to different partitions without disturbing each other (if you enable partition-level locking).
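A sketch of partitioning a table by year (the boundary dates, filegroup names and table are all examples):

```sql
-- Three boundary values give four partitions.
CREATE PARTITION FUNCTION pfOrderYear (datetime)
    AS RANGE RIGHT FOR VALUES ('2005-01-01', '2006-01-01', '2007-01-01');

-- Map each partition to a filegroup; old data can go on slower disks.
CREATE PARTITION SCHEME psOrderYear
    AS PARTITION pfOrderYear
    TO (fgArchive, fg2005, fg2006, fg2007);

-- Create the table on the partition scheme, partitioned by OrderDate.
CREATE TABLE dbo.Orders
(
    OrderId   int      NOT NULL,
    OrderDate datetime NOT NULL
) ON psOrderYear (OrderDate);
```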
Schemas are a great improvement in how to manage security. Since SQL Server 2005, you can create schemas under which you group your objects. You can then grant users access to schemas, which means they get access to all objects belonging to that schema. It’s like managing access on folder level in Windows instead of managing it on file level.
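A small sketch (schema, table and user names are examples):

```sql
-- Create a schema and an object in it.
CREATE SCHEMA Sales AUTHORIZATION dbo;
GO
CREATE TABLE Sales.Orders (OrderId int PRIMARY KEY);
GO
-- One grant covers every current and future object in the schema.
GRANT SELECT ON SCHEMA::Sales TO SomeUser;
```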
Asymmetric keys and certificates
Another important new security feature was the possibility to use Certificates and Asymmetric Keys in SQL Server 2005. That makes it possible to do strong authentication and create/verify digital signatures.
Business Intelligence news
As I mentioned in the beginning, the biggest improvements in SQL Server 2005 were perhaps for Business Intelligence and Data Warehouse users. Analysis Services (SSAS) and Integration Services (SSIS, formerly known as DTS – Data Transformation Services) were completely rewritten and deserve an Enterprise Readiness history of their own. Reporting Services was introduced. At the end of the day this meant that BI/DW developers, since SQL Server 2005, have had a strong and broad BI platform. It meant SQL Server could serve many more Enterprise needs than before.
SQL Server 2008
SQL Server 2008 is seen by some people as not having much new compared to SQL Server 2005. However, I think the new features in SQL Server 2008 are very important from an Enterprise Readiness perspective.
Database compression was introduced in SQL Server 2008. The impact of compression should not be underestimated, since it not only reduces database sizes but also often increases performance, because compressed data requires less disk I/O (the most common bottleneck). Two compression options were introduced: ROW, which was the simplest, and PAGE, which required more resources to do the compression but also gave higher compression ratios.
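Enabling page compression on an existing table is a one-liner; there is also a procedure that estimates the gain beforehand. The table name here is an example:

```sql
-- Rebuild the table with page compression.
ALTER TABLE dbo.FactSales REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Estimate the space savings before committing to a rebuild.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'FactSales',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';
```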
BLOBs (binary large objects) are something that most database platforms support. They can be used to store pictures, documents or even video recordings. The Filestream feature in SQL Server 2008 allows BLOBs to be stored on a separate file share instead of within the database itself. This means that the BLOBs can also be accessed directly through the file system instead of through SQL Server. Just enabling Filestream on tables with big BLOBs can be a good performance benefit, but it also makes integrations easier in some cases.
When consolidating and virtualizing SQL Server environments, it became increasingly important to ensure that no single user or application could use up too many resources. Like in a mainframe system, you want to be able to prioritize users and applications, and set limits on their usage of CPU and memory. The Resource Governor in SQL Server 2008 allows resource pools to be defined and users/applications to be tied to specific resource pools. Again, this is very important from an Enterprise Readiness perspective, where you want to run many databases on the same server.
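A minimal Resource Governor setup, run in the master database (the pool, group and login names are made up):

```sql
-- Cap a reporting workload at 30 percent CPU.
CREATE RESOURCE POOL ReportPool WITH (MAX_CPU_PERCENT = 30);
CREATE WORKLOAD GROUP ReportGroup USING ReportPool;
GO
-- Classifier function: route a specific login into the reporting group.
CREATE FUNCTION dbo.fnClassifier() RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
    IF SUSER_SNAME() = N'report_user'
        RETURN N'ReportGroup';
    RETURN N'default';
END;
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fnClassifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;
```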
Policy Based Management
The Policy Based Management features in SQL Server 2008 were introduced as a way to enforce policies, restrictions and behaviours on SQL Servers. Policies can be applied to multiple servers and monitored from a single console. It uses a simple expression syntax and is very powerful in what you can control, which could be anything from naming standards to database options (so they cannot be modified by accidental DBAs).
Transparent Data Encryption
An additional security feature in SQL Server 2008 was Transparent Data Encryption (TDE). It encrypts the whole database automatically (no changes to applications are needed). This is useful for databases containing sensitive information, where someone who gets hold of the backups or database files should not be able to read the data. The database is encrypted with a symmetric key, which in turn is protected by a certificate stored in the master database.
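Setting up TDE takes a few statements (the certificate name, password and database are examples; in a real deployment you must also back up the certificate, or the database is lost with the server):

```sql
-- In master: create the key hierarchy that protects the encryption key.
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'StrongPassword!1';
CREATE CERTIFICATE TdeCert WITH SUBJECT = 'TDE certificate';
GO
-- In the user database: create the encryption key and turn encryption on.
USE SalesDB;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TdeCert;
ALTER DATABASE SalesDB SET ENCRYPTION ON;
```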
PowerShell is a powerful scripting environment that enables IT professionals to write scripts for administrative tasks. Since SQL Server 2008, it is also possible to perform SQL Server administrative tasks from PowerShell. Many database administrators use PowerShell to automate their tasks.
Change Data Capture (CDC) and Change Tracking (CT) were introduced in SQL Server 2008 to capture a history of all changes made to database tables. This is very useful for applications and other databases (such as a Data Warehouse) that can subscribe to this change history.
Extended Events is a way of monitoring what’s going on inside SQL Server. There are about 200 different extended events available in SQL Server 2008, which give a level of detail that has never been available before. Microsoft also announced that it would retire the old SQL Profiler and that all monitoring in the future should use Extended Events instead.
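A sketch of an event session that captures long-running statements (the session name and threshold are examples; note that the unit of the `duration` field has varied between SQL Server versions):

```sql
-- Capture statements that run longer than the threshold into a ring buffer.
CREATE EVENT SESSION LongQueries ON SERVER
ADD EVENT sqlserver.sql_statement_completed
    (WHERE duration > 1000000)
ADD TARGET package0.ring_buffer;

-- Start collecting.
ALTER EVENT SESSION LongQueries ON SERVER STATE = START;
```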
Optimize for Ad Hoc Workloads
This is a small but significant feature that was introduced in SQL Server 2008. When enabled, SQL Server is more restrictive in adding new plans to the plan cache, which can save a lot of valuable memory in environments where mostly ad hoc SQL queries are used.
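The setting is enabled server-wide with `sp_configure`:

```sql
-- 'optimize for ad hoc workloads' is an advanced option.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;
```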
SQL Server 2008 R2
Not many new features were added in SQL Server 2008 R2. It was more focused on performance and on improving existing features, such as better compression of Unicode strings. It was also here that Microsoft started to talk about Self-Service BI and developed the in-memory analytics engine xVelocity.
Multi server management in Management Studio
A new feature in Management Studio was the ability to run a query against many servers at the same time. This is done by creating a server group in the Registered Servers window and running the query against the group.
Hyper-V Live Migration support
In Windows Server 2008 R2, a new feature was Hyper-V Live Migration. It allowed virtual machines to be moved from one server to another without stopping them. SQL Server 2008 R2 added support for this feature, so you could move your virtual SQL Servers without rebooting.
SQL Server 2012
A new High Availability solution called AlwaysOn was launched in SQL Server 2012. It combines the best of Failover Clustering and Database Mirroring. You are no longer dependent on a SAN, and you can manage databases in Availability Groups that fail over together. The number of false failovers and the downtime during failover have also been reduced heavily compared to earlier technologies. Servers can now be in different geographical locations (as opposed to failover clustering in earlier versions of SQL Server). Up to 4 secondaries (of which 2 can be synchronous) can be available, and they can be used for running queries. Even backups can be run on the secondaries.
ColumnStore indexes compress the data column by column and can therefore achieve very good compression ratios, especially for Data Warehouse fact tables. Another advantage is that queries that involve only a few columns run much faster, as SQL Server only needs to retrieve data for those columns and not the whole rows. The end result is that typical Data Warehouse workloads, or other workloads where you write seldom but read often, can get greatly improved performance.
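Creating one is straightforward (table and column names are examples); note that in SQL Server 2012 a table with a columnstore index becomes read-only until the index is dropped or disabled:

```sql
-- A nonclustered columnstore index on a fact table.
CREATE NONCLUSTERED COLUMNSTORE INDEX csx_FactSales
    ON dbo.FactSales (DateKey, ProductKey, Amount);
```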
SQL Server Data Tools
The purpose of SSDT (SQL Server Data Tools) is to give database developers tools as good as those application developers have. SSDT is a plugin for Visual Studio that provides a new SQL Server development environment including refactoring, unit testing, code analysis, source control and schema compare, just to mention some of the benefits. These features require Visual Studio 2012 Professional or better.
Data-Tier Applications (DACPACs and BACPACs)
Deploying and upgrading SQL Server application databases has always been tricky and involved a lot of manual work. Now databases can be packaged into DACPACs (Data-Tier Applications), which are much easier to deploy. For instance, you can autogenerate deployment scripts based on your existing database and a DACPAC. If you also want to include data, there’s the BACPAC. It works the same as a DACPAC, except that it also includes your data.
One earlier problem with SQL Server databases has been that they are dependent on server-level (or rather instance-level) objects. For instance, before SQL Server 2012 you couldn’t create user accounts with passwords in a database; you had to create a login with a password on the server and then tie the database user to that login. With the contained databases introduced in SQL Server 2012, more objects live within the database itself, which makes it easier to move databases from one server to another.
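A sketch of creating a contained database user with a password (database, user and password are examples):

```sql
-- Allow contained databases on the instance.
EXEC sp_configure 'contained database authentication', 1;
RECONFIGURE;
GO
-- Make the database partially contained.
ALTER DATABASE SalesDB SET CONTAINMENT = PARTIAL;
GO
-- Create a user with a password directly in the database - no server login needed.
USE SalesDB;
CREATE USER AppUser WITH PASSWORD = 'StrongPassword!1';
```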
User-defined server roles
In SQL Server 2012, new server roles with the desired permission sets can be created and users assigned to them. That makes managing the rights of database administrators and other staff easier.
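For example, a role for monitoring staff could be sketched like this (the role name, permission and account are examples):

```sql
-- A custom server role that may only look at server state.
CREATE SERVER ROLE MonitoringRole;
GRANT VIEW SERVER STATE TO MonitoringRole;
ALTER SERVER ROLE MonitoringRole ADD MEMBER [DOMAIN\MonitorUser];
```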
There have been a lot of improvements to SQL Server between 2000 and 2012. Here I’ve highlighted the ones I think are most important from an Enterprise Readiness perspective, but there are many other very important improvements, also for Business Intelligence users and Database Developers, that I’m not covering in this blog post.