Finally: an answer of sorts from NetApp on the BCS & ZCS issue
The answers below give a great perspective on the switch back to 512-byte sectors from 520-byte sectors, and they are worth reading all the way through.
But I still think NetApp should release reliable, repeatable and verifiable performance data so that customers can make informed, economical business decisions based on the costs and risk factors of storing D/R data on ATA disk as compared to FC disk. Additionally, running dual parity (RAID-DP) to protect against a second disk failure costs extra disks and usable space, so customers need to know what percentage of disk space is lost in these configurations and what that costs. Is it possible that, because you don't need to run DP with FC disks, in certain smaller RAID configurations it could be cheaper to run FC than ATA on NetApp filers?
Finally, is there a read or write penalty to running databases on ATA disks with RAID-DP and ZCS formatting, as compared to the faster Fibre Channel disks with BCS formatting?
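To make the wasted-space question concrete, here is a rough sketch of the arithmetic. The function names, RAID group sizes, disk sizes and prices are all hypothetical placeholders, not NetApp defaults or real list prices. The only assumption about the RAID types is that single parity reserves one disk per RAID group and RAID-DP reserves two.

```python
def usable_fraction(disks_in_group: int, parity_disks: int) -> float:
    """Fraction of a RAID group's raw capacity left for data.

    Assumes every disk in the group is the same size and that
    `parity_disks` whole disks are reserved for parity.
    """
    if not 0 <= parity_disks < disks_in_group:
        raise ValueError("parity_disks must be >= 0 and less than disks_in_group")
    return (disks_in_group - parity_disks) / disks_in_group


def cost_per_usable_gb(disk_price: float, disk_gb: float,
                       disks_in_group: int, parity_disks: int) -> float:
    """Cost of one usable GB in a single RAID group (hypothetical prices)."""
    total_cost = disk_price * disks_in_group
    usable_gb = disk_gb * (disks_in_group - parity_disks)
    return total_cost / usable_gb


if __name__ == "__main__":
    # Hypothetical 8-disk group: single parity vs. dual parity.
    print(f"single parity: {usable_fraction(8, 1):.1%} usable")
    print(f"dual parity:   {usable_fraction(8, 2):.1%} usable")
```

Plugging real quotes for FC and ATA disks into `cost_per_usable_gb` would show whether a small single-parity FC group ever beats a dual-parity ATA group on cost per usable GB. It says nothing about performance, which is the data I would still like to see published.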
April 22nd, 2006 at 10:57 am
Cross Posted from the previous thread: From Dave Hitz, CTO, Network Appliance:
Let me take a shot at this. I asked one of our engineers to take a look at this thread as well, so if I mess up the details, hopefully he can set me right. (Hi Steve.)
Reformatting the disk drives from 512-byte blocks to 520-byte blocks and putting the checksum right in each individual block is the best solution, because it doesn’t take any extra seeks or reads to get the checksum data you need. This is called BCS or Block Checksum. (Most high-end storage vendors have something similar. EMC and Hitachi certainly do.)
Unfortunately, we aren’t able to format ATA drives with 520-byte blocks. Maybe someday, but not yet. So with ATA we use a different technology called Zoned Checksum (or ZCS) where we steal every Nth block on the disk and use it for the checksums. (I think N is 64, but can’t remember for sure.) This is less efficient because you have to read extra data, but it allows you to get the reliability benefits of checksums even with ATA drives, which is important because ATA drives are less reliable.
And what about the RAID-DP (DP = “double parity”)? I think that RAID-DP is a wise choice for all drives, Fibre Channel or ATA, but given that ATA drives are less reliable we make RAID-DP the default there. I’m wondering if it’s time to make it the default for Fibre Channel drives as well, but as far as I know, we haven’t done that yet.
Why sell less reliable drives? ATA drives are cheaper! If you’ve got the money, then by all means keep buying Fibre Channel drives and keep using block checksums.
On the other hand, if you want to save money, and your application can get by with a bit less performance, then the combination of RAID-DP and Zoned Checksums can make ATA drives very safe. We used to recommend ATA only for disk-based backup or for archival storage, but now that we have RAID-DP and ZCS, we see lots of customers using it for primary storage, which is why we are starting to support ATA through the entire product line, and not just in the R-Series.
********************
I agree with Jon on this. I guess it is my turn to invite Dave out for dinner to thank him for clarifying the issues so well.
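For anyone who wants to put numbers on Dave's description, here is a minimal sketch of the space cost of the two checksum schemes. It assumes BCS stores 512 data bytes in each 520-byte sector and that ZCS gives up one block in every N, with N = 64 taken from Dave's "I think" above and left as a parameter. The zone layout in `zcs_checksum_block` is purely my own illustration of why ZCS needs an extra read; it is not NetApp's actual on-disk format.

```python
def bcs_overhead(data_bytes: int = 512, sector_bytes: int = 520) -> float:
    """Fraction of each sector spent on the checksum area under BCS."""
    return (sector_bytes - data_bytes) / sector_bytes


def zcs_overhead(n: int = 64) -> float:
    """Fraction of blocks given up to checksums when every n-th block is taken."""
    return 1 / n


def zcs_checksum_block(data_block: int, n: int = 64) -> int:
    """Physical block holding the checksum for a logical data block.

    Hypothetical layout: the disk is split into zones of n blocks, and the
    last block of each zone holds the checksums for the other n - 1 blocks.
    """
    zone = data_block // (n - 1)
    return zone * n + (n - 1)


if __name__ == "__main__":
    print(f"BCS overhead: {bcs_overhead():.2%}")   # 8 / 520
    print(f"ZCS overhead: {zcs_overhead():.2%}")   # 1 / 64
    print("checksum for data block 0 lives in block", zcs_checksum_block(0))
```

On these assumptions the space cost of the two schemes is nearly identical, at roughly 1.5%. The difference Dave describes is that with BCS the checksum arrives in the same sector as the data, while with ZCS it lives in a separate block that has to be read as well.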