Running Microsoft SQL Server on Oracle OCI IaaS can deliver excellent performance, but only if the storage layer is designed and validated correctly. Too often, performance issues are attributed to the database engine while the real bottleneck sits in an improperly configured or untested storage setup.
In this article, I walk through a storage configuration specifically designed for SQL Server workloads on Oracle OCI, focusing on block volume layout, attachment type, and I/O characteristics. The setup is not theoretical: it has been tested using FIO benchmarks built to realistically simulate SQL Server I/O patterns, including random reads and writes, queue depth behavior, and concurrency.
The goal is simple and practical: show which storage choices actually matter, quantify their impact with measurable data, and provide a repeatable approach to validate I/O performance before going into production. All benchmark results are included and discussed, so you can directly compare configurations and understand the trade-offs instead of relying on assumptions.
This article builds on concepts already discussed in previous posts, starting from a critical but often misunderstood topic: what "guaranteed IOPS" actually means for enterprise workloads. Guaranteed IOPS represent a minimum capability of the storage service under ideal conditions, not a promise that real-world applications will constantly reach those numbers. As explained in the first article, database workloads like SQL Server introduce randomness, mixed read/write patterns, variable block sizes, and queue depth behavior that can significantly diverge from synthetic benchmarks and marketing figures.
In addition, achieving high and consistent storage performance on Oracle OCI is not just a matter of selecting a high-performance block volume tier. As detailed in the second article, maximum disk I/O throughput on Windows instances depends on a combination of factors: instance shape network bandwidth, attachment model, number of block volumes, striping strategy, and how well the operating system can drive parallel I/O. In other words, performance is an architectural outcome, not a single configuration switch.
These two topics form the foundation for the analysis presented here. Understanding the limits of guaranteed IOPS and knowing how to unlock the available throughput at the OS and platform level are prerequisites before evaluating SQL Server storage performance. The FIO benchmarks shown later in this article are designed precisely to validate these principles under realistic SQL Server-like workloads, using measurable data instead of assumptions.
Enough theory. Let's move straight to the numbers and compare storage performance using FIO simulation tests across different storage configurations, each aligned with the Oracle OCI IaaS shapes evaluated in this analysis.
The FIO command used to generate the workload is designed to simulate a realistic stress test for a large enterprise SQL Server database. For simplicity, I did not separate the log files onto a different volume, but I created multiple files with different block sizes to emulate concurrent and parallel I/O requests.
- Uses windowsaio with direct I/O to bypass cache.
- Generates time-based I/O for 300 seconds.
- Simulates multiple files:
  - 3 x SQL data files: random read/write, 70% reads, 8K blocks, moderate queue depth and parallel jobs.
  - 2 x tempdb files: random read/write, 50% reads, 8K blocks, lower concurrency.
  - 1 x SQL log file: sequential writes only, 64K blocks, lower depth.
- Multiple jobs and threads emulate concurrent database workloads.
Basically a realistic, mixed SQL Server stress test covering data, tempdb, and log I/O.
fio --name=sql-data1 --filename=datafile1.dat --ioengine=windowsaio --direct=1 --rw=randrw --rwmixread=70 --bs=8K --numjobs=6 --iodepth=8 --size=40G --time_based --runtime=300 --group_reporting --thread ^
--name=sql-data2 --filename=datafile2.dat --ioengine=windowsaio --direct=1 --rw=randrw --rwmixread=70 --bs=8K --numjobs=6 --iodepth=8 --size=40G --time_based --runtime=300 --group_reporting --thread ^
--name=sql-data3 --filename=datafile3.dat --ioengine=windowsaio --direct=1 --rw=randrw --rwmixread=70 --bs=8K --numjobs=4 --iodepth=8 --size=30G --time_based --runtime=300 --group_reporting --thread ^
--name=tempdb1 --filename=tempdb1.dat --ioengine=windowsaio --direct=1 --rw=randrw --rwmixread=50 --bs=8K --numjobs=4 --iodepth=8 --size=20G --time_based --runtime=300 --group_reporting --thread ^
--name=tempdb2 --filename=tempdb2.dat --ioengine=windowsaio --direct=1 --rw=randrw --rwmixread=50 --bs=8K --numjobs=2 --iodepth=4 --size=10G --time_based --runtime=300 --group_reporting --thread ^
--name=sql-log --filename=sql-log.ldf --ioengine=windowsaio --direct=1 --rw=write --bs=64K --numjobs=2 --iodepth=4 --size=20G --time_based --runtime=300 --group_reporting --thread
Test Environment:
Test 1:
Shape: VM.Optimized3.Flex with 16 OCPUs and 48 GB RAM, with 1 Block Volume (120 VPUs) attached, tested both paravirtualized and with SR-IOV. OS: Windows Server 2022 Standard.
Test 2:
Shape: VM.Optimized3.Flex with 16 OCPUs and 48 GB RAM, with 3 Block Volumes (120 VPUs each) attached and striped in a pool, tested both paravirtualized and with SR-IOV. OS: Windows Server 2022 Standard.
Storage setup:
A) 1 Block Volume – Paravirtualized (VPU120)
B) 1 Block Volume – SR-IOV (VFIO) (VPU120)
C) 3 Block Volumes striped – Paravirtualized (VPU120)
D) 3 Block Volumes striped – SR-IOV (VFIO) (VPU120)
Recommended max latencies for SQL Server
| Average Latency (ms) | Category | Impact on SQL Server Load | Notes |
|---|---|---|---|
| < 1 ms | Excellent | No impact, extremely responsive | Ideal for NVMe or enterprise SSD volumes in high-intensity OLTP setups |
| 1–2 ms | Good | SQL Server load handled smoothly | Suitable for medium OLTP/OLAP, perfect for VMs with SSD-optimized storage |
| 2–5 ms | Acceptable | Possible minor delays on intensive queries | Suitable for light OLTP or test environments |
| 5–10 ms | Poor | Noticeable latency, may slow transactions | Only useful for batch jobs or non-critical reports |
| > 10 ms | Critical | Performance severely degraded, risk of timeouts | Not suitable for OLTP production, only for backup or archival |
| > 20 ms | Unsustainable | Database could hang or encounter errors | Must be avoided in production |
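These thresholds are easy to encode as a small helper for automated checks against FIO output or SQL Server wait statistics. This is a minimal sketch; the function name and structure are mine, the bands come straight from the table above:

```python
def latency_category(avg_ms: float) -> str:
    """Classify an average I/O latency (in ms) per the SQL Server latency table."""
    if avg_ms < 1:
        return "Excellent"
    if avg_ms < 2:
        return "Good"
    if avg_ms < 5:
        return "Acceptable"
    if avg_ms < 10:
        return "Poor"
    if avg_ms <= 20:
        return "Critical"
    return "Unsustainable"

# Example: configuration C's average read latency of 779.4 µs
print(latency_category(779.4 / 1000))  # → Excellent
```

In practice you would feed this from FIO's JSON output (`--output-format=json`) or from `sys.dm_io_virtual_file_stats` rather than hard-coded values.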
TABLE 1 – DIRECT COMPARISON OF ALL 4 SCENARIOS
| Metric | A) 1BV PARAVIRT | B) 1BV SR-IOV | C) 3BV PARAVIRT | D) 3BV SR-IOV |
|---|---|---|---|---|
| READ BW (MiB/s) | 770 | 714 | 1055 | 1025 |
| READ IOPS (k) | 98.6 | 91.4 | 135 | 131 |
| WRITE BW (MiB/s) | 501 | 464 | 675 | 656 |
| WRITE IOPS (k) | 53.9 | 49.9 | 73.5 | 71.4 |
| Latency READ avg (µs) | 1072.5 | 1164.0 | 779.4 | 803.1 |
| Latency READ p99 (µs) | 1926 | 2245 | 1532 | 2024 |
| Latency READ p99.9 (µs) | 2671 | 2704 | 2376 | 6259 |
| Latency WRITE avg (µs) | 1201.4 | 1298.8 | 886.5 | 912.1 |
| Latency WRITE p99 (µs) | 3294 | 3458 | 2507 | 3097 |
| Latency WRITE p99.9 (µs) | 5276 | 5735 | 4146 | 6456 |
| CPU usr % | 1.69 | 1.67 | 3.40 | 3.25 |
| CPU sys % | 8.14 | 8.10 | 17.93 | 17.24 |
| Total IO R (GiB) | 226 | 209 | 309 | 300 |
| Total IO W (GiB) | 147 | 136 | 198 | 192 |
Even in the worst configuration (D, 3 BV striped + SR-IOV), the server maintains excellent average latencies for SQL Server, so a normal load will be handled without issues. Only the p99.9 peaks approach problematic territory, which could cause brief slowdowns on very intensive queries or under maximum tempdb stress. Overall, the performance is solid and suitable for OLTP/OLAP production environments.
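The percentage deltas in the tables that follow are plain relative differences against the stated baseline. A quick sketch to reproduce them from the raw Table 1 numbers (variable and function names are mine):

```python
def delta_pct(value: float, baseline: float) -> float:
    """Relative difference vs baseline, in percent, rounded to one decimal."""
    return round((value - baseline) / baseline * 100, 1)

# READ BW (MiB/s) from Table 1
read_bw = {"A": 770, "B": 714, "C": 1055, "D": 1025}

# SR-IOV vs paravirtualized, 1 BV (baseline = A)
print(delta_pct(read_bw["B"], read_bw["A"]))  # → -7.3

# 3 BV striped vs 1 BV, paravirtualized (baseline = A)
print(delta_pct(read_bw["C"], read_bw["A"]))  # → 37.0
```

The same function reproduces every cell in Tables 2 and 3, including the p99.9 outlier: delta_pct(6259, 2376) gives the +163.4% read tail regression for 3 BV SR-IOV.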
TABLE 2 – DELTA: SR-IOV vs PARAVIRTUALIZED (same number of block volumes)
Baseline = Paravirtualized
BW/IOPS: positive = better | Latency: positive = worse
| Metric | 1 BV SR-IOV vs 1 BV PARAVIRTUALIZED | 3 BV striped SR-IOV vs 3 BV striped PARAVIRTUALIZED |
|---|---|---|
| READ BW | -7.3% | -2.8% |
| READ IOPS | -7.3% | -3.0% |
| WRITE BW | -7.4% | -2.8% |
| WRITE IOPS | -7.4% | -2.9% |
| Lat. READ avg | +8.5% (worse) | +3.0% (worse) |
| Lat. READ p99 | +16.6% (worse) | +32.1% (worse) |
| Lat. READ p99.9 | +1.2% (≈ same) | +163.4% (MUCH worse) |
| Lat. WRITE avg | +8.1% (worse) | +2.9% (worse) |
| Lat. WRITE p99 | +5.0% (worse) | +23.5% (worse) |
| Lat. WRITE p99.9 | +8.7% (worse) | +55.7% (worse) |
Note: SR-IOV is ALWAYS worse than Paravirtualized across all tests.
TABLE 3 – DELTA: 3 BV STRIPED vs 1 BV (same driver)
Baseline = 1 Block Volume
BW/IOPS: positive = better | Latency: negative = better
| Metric | Paravirtualized 3BV vs 1BV Paravirtualized | SR-IOV 3BV vs 1BV SR-IOV |
|---|---|---|
| READ BW | +37.0% | +43.6% |
| READ IOPS | +36.9% | +43.3% |
| WRITE BW | +34.7% | +41.4% |
| WRITE IOPS | +36.4% | +43.1% |
| Lat. READ avg | -27.3% (better) | -31.0% (better) |
| Lat. READ p99 | -20.5% (better) | -9.8% (better) |
| Lat. READ p99.9 | -11.0% (better) | +131.5% (MUCH worse) |
| Lat. WRITE avg | -26.2% (better) | -29.8% (better) |
| Lat. WRITE p99 | -23.9% (better) | -10.4% (better) |
| Lat. WRITE p99.9 | -21.4% (better) | +12.6% (worse) |
Note: striping always helps in the mid range (avg/p99), but under SR-IOV the p99.9 tails explode.
TABLE 4 – FINAL RANKING (best → worst per key metric)
| Metric | 1st (best) | 2nd | 3rd | 4th |
|---|---|---|---|---|
| READ BW | C 3BV PARA | D 3BV SRIOV | A 1BV PARA | B 1BV SRIOV |
| WRITE BW | C 3BV PARA | D 3BV SRIOV | A 1BV PARA | B 1BV SRIOV |
| READ IOPS | C 3BV PARA | D 3BV SRIOV | A 1BV PARA | B 1BV SRIOV |
| WRITE IOPS | C 3BV PARA | D 3BV SRIOV | A 1BV PARA | B 1BV SRIOV |
| Lat. READ avg | C 3BV PARA | D 3BV SRIOV | A 1BV PARA | B 1BV SRIOV |
| Lat. READ p99 | C 3BV PARA | A 1BV PARA | D 3BV SRIOV | B 1BV SRIOV |
| Lat. READ p99.9 | C 3BV PARA | A 1BV PARA | B 1BV SRIOV | D 3BV SRIOV |
| Lat. WRITE avg | C 3BV PARA | D 3BV SRIOV | A 1BV PARA | B 1BV SRIOV |
| Lat. WRITE p99 | C 3BV PARA | D 3BV SRIOV | A 1BV PARA | B 1BV SRIOV |
| Lat. WRITE p99.9 | C 3BV PARA | A 1BV PARA | B 1BV SRIOV | D 3BV SRIOV |
Configuration C (3 BV striped + Paravirtualized) wins across ALL metrics.
Additional Considerations:
- The latency perceived by SQL Server also depends on IOPS and workload type (random read/write vs sequential).
- For mixed workloads (e.g., highly active tempdb), a stable latency below 2 ms is strongly recommended.
- OLAP batch workloads can tolerate higher latencies, while OLTP with many concurrent transactions requires minimal latency.
ANALYSIS AND RECOMMENDATIONS
1) STRIPING: always worth it
Moving from 1 to 3 Block Volumes in a striped configuration delivers a consistent gain:
- BW and IOPS increase by 35–44% in both read and write
- Average latency drops by 26–31%
- p99 percentiles improve by 10–24%
Exception: tail latency (p99.9) under SR-IOV degrades drastically.
2) SR-IOV vs Paravirtualized: Paravirtualized is better
- SR-IOV bandwidth and IOPS are 3–7% lower
- Average latency is slightly higher (+3–8%)
- p99 and p99.9 noticeably worse, especially with 3 BV striped
Possible causes: driver optimization, VFIO overhead, or Windows-level IO complexity.
3) The p99.9 “bomb” under SR-IOV + 3 BV striped
- 3BV SR-IOV R p99.9 = 6259 µs (+163% vs 3BV PARA 2376 µs)
- 3BV SR-IOV W p99.9 = 6456 µs (+55% vs 3BV PARA 4146 µs)
Critical for SQL Server transactional workloads; can stall query batches.
FINAL RECOMMENDATION
Recommended configuration for the best combination of throughput and latency: 3 Block Volumes striped + paravirtualized driver
Benefits:
- Highest bandwidth (1055 R / 675 W MiB/s)
- Highest IOPS (135k R / 73.5k W)
- Lowest average latency
- Most stable tail latency (p99.9)
- No risk of the extreme latency spikes seen under SR-IOV.
Avoid:
- SR-IOV + 3 BV striped
- Single BV (any driver): limits bandwidth and IOPS with no latency benefits.
CPU note:
Striping more than doubles the system CPU load (from ~8% to ~18%).
On a VM dedicated to the DB this is not an issue, but it should be monitored if the VM shares cores with other workloads.
Please check previous posts if you want all the background on this topic.