In articles analyzing the pros and cons of NAS vs. SAN, the issue of file-level vs. block-level access often comes up. In particular, it's often said that databases operate "at the block-level". What is unique about the way a database accesses information that makes block-level access desirable? In a SAN environment, can a client request/receive individual pieces of the database (select blocks) rather than the entire database? I'm not certain where the efficiencies come in.
The issue is really one of buffering. File systems and virtual memory managers typically cache data in the server's main memory to improve performance. Because they know nothing about the application that is running, they fall back on generalized, best-effort caching policies. The application (in this case, the database) knows more about its own data access patterns and can be written to do its own buffering (or caching, if you like). To do that, it must bypass the file system cache. This requires what is termed "raw I/O," in which the database performs block I/O directly to the device. File systems have improved greatly, so this is not as significant an issue as it used to be.
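To make the "individual pieces" part of the question concrete, here is a minimal Python sketch of block-addressed I/O: the caller names a block number and reads or rewrites only that block, never the whole database file. It is an illustration under stated assumptions, not how any particular database is implemented. The 4 KB `BLOCK_SIZE`, the helper names `read_block`, `write_block` and `demo`, and the temporary file standing in for a raw device are all hypothetical. The sketch also does not bypass the operating system's cache; doing that is platform-specific (on Linux, for example, it involves opening the device with `O_DIRECT` and using aligned buffers) and is left out here.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def read_block(fd, block_number, block_size=BLOCK_SIZE):
    """Read exactly one block, addressed by block number."""
    return os.pread(fd, block_size, block_number * block_size)


def write_block(fd, block_number, data, block_size=BLOCK_SIZE):
    """Overwrite exactly one block in place."""
    if len(data) != block_size:
        raise ValueError("data must be exactly one block long")
    os.pwrite(fd, data, block_number * block_size)


def demo():
    # A temporary file stands in for the raw device (hypothetical setup).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name
        for i in range(8):
            # Fill block i with the byte value i so blocks are distinguishable.
            tmp.write(bytes([i]) * BLOCK_SIZE)

    fd = os.open(path, os.O_RDWR)
    try:
        # Touch only block 5; the other seven blocks are never read or written.
        write_block(fd, 5, b"\xff" * BLOCK_SIZE)
        block5 = read_block(fd, 5)
        block2 = read_block(fd, 2)
    finally:
        os.close(fd)
        os.unlink(path)
    return block2, block5
```

Calling `demo()` returns the contents of blocks 2 and 5 after block 5 has been rewritten. A database that manages its own buffer cache decides for itself which blocks to keep in memory and issues requests of this shape for the rest.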
Read Greg Schulz' answer to this question.