This is a subject on which computer science students do doctoral research. For an email question, the best I can do is give you a very light overview.
Dictionary-based compression algorithms usually build a dictionary (a table of character patterns) in memory as the data is scanned for repeated information (some implementations use a static dictionary, so it does not have to be built dynamically). When a pattern is recognized (by a lookup in the dictionary), that string of information is replaced by a much shorter but uniquely identifiable code. This results in a compression of the overall data. The size of the dictionary and the speed at which the scan is done are implementation decisions that vary by vendor; each is a trade-off between cost and latency. There are many techniques for doing this. The most popular family of compression algorithms is Lempel-Ziv, of which there are several versions. Run-length encoding is a simpler relative that looks for runs of repeated characters. Huffman encoding uses the mathematical probability of character occurrence to represent more frequent characters with shorter bit strings.
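To make the dictionary-building idea concrete, here is a toy sketch in Python. The first pair of functions is a simplified LZ78-style compressor: as the input is scanned, each newly seen phrase is added to the dictionary, and the output is a list of (dictionary index, next character) pairs. The second function is a bare-bones run-length encoder. The function names and the pair-based output format are my own illustrative choices, not any vendor's implementation; real products pack these codes into bits and bound the dictionary size.

```python
def lz78_compress(data):
    # Dictionary maps phrases to indices; index 0 stands for the empty prefix.
    dictionary = {}
    output = []          # list of (prefix_index, next_char) pairs
    phrase = ""
    for ch in data:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate                    # keep extending the match
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended in the middle of a known phrase; emit it with no new char.
        output.append((dictionary[phrase], ""))
    return output

def lz78_decompress(pairs):
    # Rebuild the dictionary on the fly; no copy of it travels with the data.
    phrases = [""]                                # index 0 is the empty phrase
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)

def rle_compress(data):
    # Run-length encoding: collapse each run of a repeated character
    # into a (character, run_length) pair.
    out = []
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i]:
            run += 1
        out.append((data[i], run))
        i += run
    return out
```

For example, `lz78_compress("ababab")` yields only four pairs because the repeated "ab" phrase is replaced by a short dictionary reference, and `lz78_decompress` recovers the original string exactly.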
This is a whole computer science discipline with many very good textbooks. I suggest buying a couple of those and reading further.
Evaluator Group, Inc.