Azure VM SQL Server Disk Configuration
Proper disk configuration is one of the most important determining factors of SQL Server performance. When running SQL Server on Windows Azure VMs this is even more important, because Windows Azure VMs run on shared infrastructure where performance can vary. The following are some guidelines to follow when setting up SQL Server on Windows Azure VMs:
What types of disks can an Azure VM have?
Azure virtual machines have three general categories of disks:
- A single operating system disk (C drive), which is backed by a VHD file stored as a page blob in an Azure storage account.
- A single temporary local disk (D drive). This drive is backed by the local disks on the host that runs your virtual machine (VM). Any data on this disk could be lost after a reboot. It is used to store page files for the operating system as well as other temporary files required by the virtual machine.
- One or more data disks, which are backed by VHD files stored as page blobs in an Azure storage account. The number of data disks that can be attached to a VM is determined by the VM size.
What cache settings can be set on the disks, and how can I change them?
Windows Azure disks support three different cache settings:
- Read Only: Reads and writes are cached for future reads but writes are persisted directly to storage
- Read Write: Reads and writes are cached for future reads. Non-write-through writes are persisted to the local cache first and then lazily flushed to the Windows Azure Blob service. For SQL Server, writes are always persisted to Windows Azure Storage because SQL Server uses write-through.
- None (disabled): Requests bypass the cache completely.
Caching is OFF by default for both read and write operations on data disks. Caching is ON by default for read and write operations on operating system disks. To change these settings, use the Azure management cmdlets.
When should you place databases on the operating system disk?
Smaller databases (10 GB or smaller) that require a low number of concurrent read I/Os can be placed on the operating system disk (C drive). A common scenario of this would be on a development, testing, or staging server.
Should you place TempDB on the Azure VM’s D drive?
Do not place TempDB on the D drive, because performance on the D drive is not guaranteed to be predictable. TempDB performance can be critical to overall SQL Server performance under certain workloads, so placing it on the D drive can cause performance bottlenecks.
So where should TempDB be placed?
Either place TempDB on the operating system disk or place it on a data disk. The choice depends on how heavily your workload uses TempDB.
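If you opt for a data disk, the move can be sketched as follows. The drive letter and folder (F:\TempDB) are assumptions for illustration; adjust them to however your data disk is mounted. The change takes effect after the SQL Server service is restarted.

```sql
-- Sketch: relocate TempDB's data and log files to an attached data disk
-- (assumed mounted as F:). Requires a SQL Server service restart to apply.
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = 'F:\TempDB\tempdb.mdf');
ALTER DATABASE tempdb
MODIFY FILE (NAME = templog, FILENAME = 'F:\TempDB\templog.ldf');
```

Make sure the target folder exists and the SQL Server service account has write access to it before restarting.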
Where should the Blobs backing my VHDs be located?
You should place them in the same location (datacenter) as your hosted services. Note that the location is determined at the storage account level.
How can I improve the performance of my disks?
- Place each transaction log on its own VHD.
- For OLTP type loads (many small reads and writes) performance will scale linearly as you add more VHDs.
- For large sequential I/O writes (e.g. BCP or SSIS data imports) writes will scale linearly as you add more VHDs.
- Turn on row or page level compression to reduce the amount of data transferred to and from disk.
- For OLTP type loads keep the cache set to the default value of “None” (disabled). This is the default for data disks.
- For the operating system disk, use the “Read Write” cache setting. This is the default for operating system disks.
- If your I/O workload demands very fast reads with few concurrent readers, turn on the Read Only cache.
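The row/page compression suggestion above can be sketched in T-SQL. The table and index names here (dbo.Orders, IX_Orders_CustomerId) are hypothetical; compression also requires a SQL Server edition that supports it.

```sql
-- Sketch: enable compression on a hypothetical table and index to reduce
-- the amount of data read from and written to the VHDs.
ALTER TABLE dbo.Orders
REBUILD WITH (DATA_COMPRESSION = PAGE);

ALTER INDEX IX_Orders_CustomerId ON dbo.Orders
REBUILD WITH (DATA_COMPRESSION = ROW);
```

Compression trades some CPU for reduced I/O, which is usually a good trade on Azure VMs where disk throughput is the scarcer resource.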
What should I do if I have very large database files?
If you anticipate that your database file(s) will exceed 1 TB in size, you should configure the server to spread the data files across multiple VHDs.
How can I spread my databases across multiple VHDs?
- Create one or more dynamic volumes consisting of multiple VHDs at the operating system level and put your database files on these volumes.
- Split your tables and indexes into separate database filegroups, and then place the files (MDF + NDF) belonging to those filegroups on separately attached VHDs. Note that this offers no benefit for transaction logs, because SQL Server does not stripe across transaction log files; it uses them sequentially.
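The second option can be sketched as follows. The database name (SalesDB), filegroup name, file sizes, and the F: and G: drive letters are all assumptions for illustration; substitute the drives your data disks are mounted on.

```sql
-- Sketch: spread a database across two attached data disks (assumed
-- mounted as F: and G:) by adding one file per disk to a new filegroup.
ALTER DATABASE SalesDB ADD FILEGROUP FG_Spread;

ALTER DATABASE SalesDB
ADD FILE (NAME = SalesData1, FILENAME = 'F:\Data\SalesData1.ndf', SIZE = 100GB)
TO FILEGROUP FG_Spread;

ALTER DATABASE SalesDB
ADD FILE (NAME = SalesData2, FILENAME = 'G:\Data\SalesData2.ndf', SIZE = 100GB)
TO FILEGROUP FG_Spread;

-- Objects created on the filegroup have their pages allocated
-- proportionally across both files, and therefore across both VHDs.
CREATE TABLE dbo.SalesFact
(
    SaleId BIGINT NOT NULL,
    Amount MONEY  NOT NULL
) ON FG_Spread;
```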
How can I baseline my disk performance to ensure good disk performance?
The best way to simulate SQL Server-type I/O requests is with SQLIO. For a quick check you can also use a tool like CrystalDiskMark. Run the baseline on your Azure VM before you put it into production and, if possible, compare the results against your existing on-premises SQL Server. Another way to compare your disk performance is to use the instaDBA SQL Server Baseline.
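Once the server is running real workloads, you can also spot-check per-file I/O latency from inside SQL Server with the sys.dm_io_virtual_file_stats dynamic management view. This is a complementary check, not a replacement for a SQLIO baseline; the counters are cumulative since the last service restart.

```sql
-- Sketch: average read/write latency per database file, derived from
-- cumulative I/O stall counters in sys.dm_io_virtual_file_stats.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id;
```

Files showing sustained average latencies far above your baseline are candidates for a different cache setting or an additional VHD.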
What is the warm-up effect?
With Windows Azure disks, there is a “warm-up effect” that can result in a reduced rate of throughput and bandwidth for a short period of time. In situations where a data disk is not accessed for a period of time (approximately 20 minutes), adaptive partitioning and load balancing mechanisms kick in. If the disk is accessed while these algorithms are active, you may notice some degradation in throughput and bandwidth for a period of time (approximately 10 minutes).
Can the storage account creation date have an impact on disk performance?
Yes, storage accounts created after June 7th, 2012 use Azure’s 2nd generation infrastructure, which provides higher transaction rates and bandwidth than the 1st generation. You can find the storage account creation date at the bottom right of your storage account dashboard, in the “quick glance” section.