Arithmetic Coding For Image Compression In Matlab

In this article, we will discuss the basics of arithmetic coders, center on the Approximate Entropy Coding and also provide you with MATLAB code for image compression using arithmetic coding.

Do you want to compress your images in Matlab? The problem is, do you want it to be fast? I see a lot of tutorials that teach how to use a poor method of compression. Is there a way to compress an image as fast as possible? Yes, but it involves Matlab’s built in arithmetic coding module. If you are not familiar with this module, then this tutorial is for you.

It’s always fun to learn new coding languages, applications, and methods. This blog will show you how to use Matlab to compress images. The blog shows you the steps in coding this as well as providing the full code listing and explanation. If you are interested in learning something new and are a beginning programmer or math student, this blog is a great place to start. You will quickly pick up the fundamentals of image compression and using Matlab for the task.

Matlab Code For Image Compression Using Arithmetic Coding

Arithmetic coding is a type of entropy encoding utilized in lossless data compression. Ordinarily, a string of characters, for example, the words “hey” is represented for utilizing a fixed number of bits per character.

In the most straightforward case, the probability of every symbol occurring is equivalent. For instance, think about a set of three symbols, A, B, and C, each similarly prone to happen. Straightforward block encoding would require 2 bits for every symbol, which is inefficient as one of the bit varieties is rarely utilized. In other words, A = 00, B = 01, and C = 10, however, 11 is unused. An increasingly productive arrangement is to speak to a succession of these three symbols as a rational number in base 3 where every digit speaks to a symbol. For instance, the arrangement “ABBCAB” could become 0.011201. In arithmetic coding as an incentive in the stretch [0, 1). The subsequent stage is to encode this ternary number utilizing a fixed-guide paired number of adequate exactness toward recuperating it, for example, 0.00101100102 — this is just 10 bits. This is achievable for long arrangements in light of the fact that there are productive, set up calculations for changing over the base of subjectively exact numbers.

When all is said and done, arithmetic coders can deliver close ideal output for some random arrangement of symbols and probabilities (the ideal value is – log2P bits for every symbol of likelihood P). Compression algorithms that utilize arithmetic coding go by deciding a structure of the data – fundamentally an expectation of what examples will be found in the symbols of the message. The more precise this prediction is, the closer to ideal the output will be.

When all is said and done, each progression of the encoding procedure, aside from the absolute last, is the equivalent; the encoder has fundamentally only three bits of information to consider: The following symbol that should be encoded. The current span (at the very beginning of the encoding procedure, the stretch is set to [0,1], yet that will change) The probabilities the model allows to every one of the different symbols that are conceivable at this stage (as referenced prior, higher-request or versatile models imply that these probabilities are not really the equivalent in each progression.) The encoder isolates the current span into sub-spans, each speaking to a small amount of the current span relative to the likelihood of that symbol in the current setting. Whichever stretch relates to the real symbol that is close to being encoded turns into the span utilized in the subsequent stage. At the point when the sum total of what symbols have been encoded, the subsequent span unambiguously recognizes the arrangement of symbols that delivered it. Any individual who has a similar last span and model that is being utilized can remake the symbol succession that more likely than not entered the encoder to bring about that last stretch. It isn’t important to transmit the last stretch, in any case; it is just important to transmit one division that exists in that span. Specifically, it is just important to transmit enough digits (in whatever base) of the part so all divisions that start with those digits fall into the last stretch; this will ensure that the subsequent code is a prefix code.

https://1f2ebde800fa2d99872761258ff3673f.safeframe.googlesyndication.com/safeframe/1-0-38/html/container.html

Encoding

  • MATLAB
% input stringstr = 'GeeksforGeeks';fprintf('The entered string is : %s\n', str);% length of the stringlen = length(str);fprintf('The length of the string is : %d\n', len);% get unique characters from the stringu = unique(str);fprintf('The unique characters are : %s\n', u);% length of the unique characters stringlen_unique = length(u);fprintf('The length of unique character string is : %d\n', len_unique);% General lookup table% get zeros of length of unique charactersz = zeros(1, len_unique);p = zeros(1, len_unique);fori = 1 : len_unique   % in 'z' variable we will find the   % occurrence of each characters from 'str'    z(i) = length(findstr(str, u(i)));   % in 'p' variable we will get   % probability of those occurrences   p(i) = z(i) / len;enddisplay(z);display(p);% in 'cpr' variable we will get the cumulative% summation of 'p' from '1' till last value of 'p'cpr = cumsum(p);% in 'newcpr' variable we are taking% 'cpr' from '0' till last value of 'p'newcpr = [0 cpr];display(cpr);display(newcpr);% make table till 'len_unique' sizefori = 1 : len_unique   % in first column we are then   % placing 'newcpr' values   interval(i, 1) = newcpr(i);   % in second column we are   % placing 'cpr' values   interval(i, 2) = cpr(i);end% Displaying the lookup tabledisplay('The lookip table is : ')display(interval);% Encoder Tablelow = 0;high = 1;fori = 1 : len   forj = 1 : len_unique       % if the value from 'str'       % matches with 'u' then       ifstr(i) == u(j);           pos = j;           j = j + 1;            % displaying the matched length           % of unique characters           display(pos);           % getting the tag value           % of the matched character           range = high - low;           high = low + (range .* interval(pos, 2));           low = low + (range .* interval(pos, 1));           i = i + 1;           break       end   endend% displaying tag valuetag = low;display(tag);

Output :

The entered string is : GeeksforGeeks
The length of the string is : 13
The unique characters are : Gefkors
The length of unique character string is : 7

z =

     2     4     1     2     1     1     2


p =

    0.1538    0.3077    0.0769    0.1538    0.0769    0.0769    0.1538


cpr =

    0.1538    0.4615    0.5385    0.6923    0.7692    0.8462    1.0000


newcpr =

         0    0.1538    0.4615    0.5385    0.6923    0.7692    0.8462    1.0000

The lookip table is : 

interval =

         0    0.1538
    0.1538    0.4615
    0.4615    0.5385
    0.5385    0.6923
    0.6923    0.7692
    0.7692    0.8462
    0.8462    1.0000


pos =

     1


pos =

     2


pos =

     2


pos =

     4


pos =

     7


pos =

     3


pos =

     5


pos =

     6


pos =

     1


pos =

     2


pos =

     2


pos =

     4


pos =

     7


tag =

    0.0409

Decoding 

  • MATLAB
str = '';fori = 1 : len   forj = 1 : len_unique            % If tag value falls between a certain       % range in lookup table then       iftag > interval(j, 1) && tag < interval(j, 2)           pos = j;           tag = (tag - interval(pos, 1)) / p(j);           % Getting the matched tag           % value character           decoded_str = u(pos);           % String concatenating           % 'decoded_str' into 'str'           str = strcat(str, decoded_str);           break       end   endend% Displaying final decoded stringdisp(str)

Output :

Leave a Comment