We all know how annoying it is to wait for ages when an email attachment is downloading and, ideally, we want our files to take up as little storage as possible. That way, they can be shared quickly, or we can store more on the same storage medium. With data compression, we can do just that for even the largest files.
What is data compression? In a nutshell, it is the process of reducing the size of files. Compressing data reduces the number of bits used to represent information. The overall goal of data compression is to represent information using the fewest bits possible.
The compression of data occurs thanks to smart algorithms and advanced mathematics.
How Does Data Compression Work?
Data compression algorithms decrease the number of symbols needed to encode source information. Therefore, the process of data compression reduces the storage space required on the hard drive.
There are two general processes in data compression: an encoding algorithm and a decoding algorithm. The encoding algorithm takes the source information and creates a compressed representation, whereas the decoding algorithm reconstructs the original data from the compressed representation.
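As a minimal sketch of this encode/decode pair, the snippet below uses Python's standard zlib module, which is one lossless option among many and is chosen here purely for illustration. The sample text and variable names are made up for the example.

```python
import zlib

# Hypothetical sample data: highly repetitive text compresses well.
source = b"the quick brown fox jumps over the lazy dog. " * 200

# Encoding: build a compressed representation of the source.
compressed = zlib.compress(source)

# Decoding: reconstruct the original bytes from the compressed representation.
restored = zlib.decompress(compressed)

print(len(source), "bytes before,", len(compressed), "bytes after")
print("round trip exact:", restored == source)
```

Because the input repeats itself so often, the compressed representation comes out far smaller than the source, and decoding returns exactly the same bytes.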
The process of encoding and decoding can result in the loss of information. Computer scientists categorize compression algorithms by whether they lose data in the process.
Types of Compression
There are two types of data compression: lossy and lossless.
In lossless data compression, no information is lost during encoding and decoding: after decompression, the data is identical to the original. Lossless data compression algorithms work by eliminating redundancy in the encoding.
For instance, an image may contain a large blue area. Instead of encoding the information repeatedly (blue pixel, blue pixel, and so on), the algorithm encodes the data as (300 blue pixels). In this way, a lossless algorithm reduces the storage requirements without losing any information.
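The blue-pixel idea is commonly known as run-length encoding. Here is a rough sketch of it in Python. The function names rle_encode and rle_decode are hypothetical, and the "pixels" are plain strings rather than real image data.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]

# A hypothetical image row: 300 blue pixels followed by 5 white ones.
row = ["blue"] * 300 + ["white"] * 5
encoded = rle_encode(row)
print(encoded)  # [('blue', 300), ('white', 5)]

# Decoding restores the row exactly, so nothing was lost.
assert rle_decode(encoded) == row
```

Run-length encoding only pays off when the data contains long runs. On data with few repeats, the list of pairs can end up larger than the input.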
People use this type of data compression when they want an exact replication of the original information.
For example, text files can have their meaning completely changed if the information is not retained perfectly. There is a vast difference between the phrases "do not click that button" and "do now click that button."
Lossy data compression involves the loss of information. This means that the decompressed file is different from the original file. Lossy algorithms both reduce redundancy and remove information.
Lossy compression usually frees up more space than lossless compression. Unfortunately, this free space sometimes comes at a cost.
Lossy compression often causes the quality of the image or sound to decrease.
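To make the trade-off concrete, here is a small sketch under simplifying assumptions. The "audio" is a made-up sine wave of 8-bit samples, the lossy step is plain rounding to a coarser scale (a hypothetical quantize helper), and zlib stands in for the lossless stage. Real audio and image codecs are far more sophisticated than this.

```python
import math
import zlib

# Hypothetical 8-bit "audio" samples: a sine wave mapped to 0..255.
samples = bytes(int(127.5 + 127.5 * math.sin(i / 10)) for i in range(5000))

def quantize(data, step=16):
    """Lossy step: round every sample down to a multiple of `step`."""
    return bytes((b // step) * step for b in data)

# Compress the untouched samples, and the samples after discarding detail.
lossless = zlib.compress(samples)
lossy = zlib.compress(quantize(samples))

# Decompressing the lossy version does not give back the original samples.
restored = zlib.decompress(lossy)
max_error = max(abs(a - b) for a, b in zip(samples, restored))

print("lossless size:", len(lossless), "bytes")
print("lossy size:   ", len(lossy), "bytes")
print("largest sample error:", max_error)
```

The quantized version compresses to fewer bytes, but each restored sample can be off by up to one step minus one. That error is the quality cost described above.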
Lossy algorithms are used in cases where the loss of data is hard for humans to perceive. Images and music files can have some information whittled down without drastically changing the listening or viewing experience. Losing a little bit of resolution is acceptable in many cases.
Lossy algorithms don't always result in noticeably lower quality. Consider an algorithm that removes the frequencies outside the range of human hearing from a music track. Sure, it won't sound the same to your dog, but you won't be able to tell the difference, and now the file is smaller. Lossy algorithms are not always bad, and they have many uses.
The Importance of Compression
In the late 1940s, Claude Shannon, the father of Information Theory, was wondering how to transmit information efficiently. His research laid the foundation for the field of study known today as data compression.
The developments coming out of data compression studies have ignited a communication revolution. This field of study has shaped both internet communication and video communication.
Today, the internet relies on file compression. When you download images online, they are compressed, generally in JPEG or GIF formats. Our videos are usually compressed in MPEG format.
Many of us take the ubiquity of data compression for granted. Without the field of data compression, cellphones, satellite TV, and fax machines would not be practical in the form we know today. There is no doubting the importance of compressing data in communications technology.
If you are interested in exploring this topic more deeply, we recommend the book entitled "Introduction to Data Compression." Written by Khalid Sayood, it will provide you with a comprehensive exploration of the subject, and there is so much more to learn!
Have you had to work with data compression programs before? Were they lossless or lossy, and how did that affect your files?
Let us know in the comments.