Discussing dictionary-based compression algorithms

What are dictionary-based compression algorithms and how do they function? Please give the details of their functioning.

This is a subject that computer science students research and write about to earn their PhDs. For an e-mail question, the best I can do is give you a very light overview.

Dictionary-based compression algorithms usually build a dictionary (a table of character patterns) in memory as the data is scanned for repeated information. Some implementations use a static dictionary, so it does not have to be built dynamically. Whenever a string of data matches an entry in the dictionary, that string is replaced by a much shorter but uniquely identifiable code. This results in compression of the overall data. The size of the dictionary and the speed at which the scan is done are implementation decisions made by the different vendors. It's a trade-off between cost and latency. There are many techniques for doing this. The most popular family of dictionary-based algorithms is Lempel-Ziv, of which there are several versions. Run-length encoding is a simpler, related technique that looks for runs of a repeated character and replaces each run with a count. Huffman encoding takes a statistical approach instead: it uses the probability of each character's occurrence to represent frequent characters with shorter bit strings.
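To make the dictionary idea concrete, here is a minimal sketch of LZW, one member of the Lempel-Ziv family, written in Python. It is an illustration only, not any vendor's implementation. The function names lzw_compress and lzw_decompress are made up for this example, the dictionary is allowed to grow without limit, and the input is assumed to contain only characters with code points below 256. Real products cap the dictionary size and pack the codes into bits rather than returning a list of integers.

```python
def lzw_compress(text: str) -> list[int]:
    """Replace repeated strings with integer codes (illustrative LZW sketch)."""
    # Assumption: every input character has a code point below 256,
    # so the starting dictionary already covers all single characters.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the match while the dictionary knows it.
            current = candidate
        else:
            # Emit the code for the longest known match, then learn
            # the new, longer pattern for future use.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Rebuild the same dictionary on the fly and expand the codes."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the entry that is about to be created.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid code: {code}")
        pieces.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(pieces)


if __name__ == "__main__":
    sample = "TOBEORNOTTOBEORTOBEORNOT"
    packed = lzw_compress(sample)
    print(len(sample), "characters ->", len(packed), "codes")
    print(lzw_decompress(packed) == sample)
```

Note that the dictionary itself is never stored with the compressed data. The decompressor rebuilds it from the codes as it goes, which is one reason this style of algorithm works well on data that is scanned once as a stream.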

This is a whole computer science discipline with many very good textbooks. I suggest buying a couple of those and reading further.

Randy Kerns
Evaluator Group, Inc.

Editor's note: Do you agree with this expert's response? If you have more to share, post it in our Storage Networking discussion forum.

This was first published in January 2002
