
Huffman Coding

 

Huffman coding is a lossless data compression algorithm named after its inventor, David A. Huffman. It produces a variable-length prefix code: frequently occurring symbols get shorter codes, rarely occurring symbols get longer ones, and no code is a prefix of any other, so the encoded bit stream can be decoded unambiguously without separators. The encoded data is typically smaller than with a fixed-length code when some symbols occur much more often than others.

How it Works

The Huffman coding algorithm starts by building a frequency table that records the number of occurrences of each symbol in the dataset. Each symbol then becomes a leaf node weighted by its frequency. The algorithm repeatedly removes the two nodes with the lowest weights and joins them under a new parent node whose weight is the sum of theirs, then puts the parent back among the remaining nodes. The process continues until only one node is left, which is the root of the tree.

Every edge in the tree is then labeled: 0 for the edge to a left child and 1 for the edge to a right child. The code for a symbol is the sequence of labels on the path from the root to that symbol's leaf. Because symbols appear only at leaves, no code can be a prefix of another. Symbols with higher frequencies are merged later and so end up closer to the root, which gives them shorter codes and reduces the compressed size.
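The path-to-code rule can be sketched as a short recursive walk. The tree below is a hypothetical one built by hand for illustration (an internal node is a (left, right) tuple, a leaf is a symbol), and assign_codes is a name chosen for this example.

```python
def assign_codes(node, prefix="", table=None):
    """Walk the tree: a left edge appends '0', a right edge appends '1'."""
    if table is None:
        table = {}
    if isinstance(node, tuple):        # internal node: (left, right)
        left, right = node
        assign_codes(left, prefix + "0", table)
        assign_codes(right, prefix + "1", table)
    else:                              # leaf: a symbol
        # A tree with a single symbol has an empty path, so give it one bit.
        table[node] = prefix or "0"
    return table


# Hypothetical tree: 'x' sits nearest the root, 'z' and 'w' are deepest.
tree = ("x", ("y", ("z", "w")))
codes = assign_codes(tree)
print(codes)  # {'x': '0', 'y': '10', 'z': '110', 'w': '111'}
```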

Example

Suppose we have a dataset that contains five symbols, "A", "B", "C", "D", and "E", with the following frequency table:

Symbol Frequency
A 45
B 13
C 12
D 16
E 9

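The algorithm would build the binary tree by merging the two lowest-weight nodes at each step:

1. E (9) + C (12) = 21
2. B (13) + D (16) = 29
3. 21 + 29 = 50
4. A (45) + 50 = 95 (the root)

The resulting tree looks like this:

        (95)
       /    \
   A:45      (50)
            /    \
        (21)      (29)
        /  \      /  \
     E:9  C:12  B:13  D:16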

Finally, the algorithm would label each left edge 0 and each right edge 1 and read off the code for each symbol. Swapping the labels at any node gives a different but equally good code.

Symbol Binary Code
A 0
B 110
C 101
D 111
E 100
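As a check on this example, the sketch below derives a code table directly from the frequencies, tests that it is prefix-free, and compares the encoded size with a fixed-length encoding. The variable names and the is_prefix_free helper are assumptions for illustration, and the dictionary-merging approach is just one compact way to build the codes.

```python
import heapq

freqs = {"A": 45, "B": 13, "C": 12, "D": 16, "E": 9}

# Each heap entry is (weight, unique_id, {symbol: code_built_so_far}).
heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
heapq.heapify(heap)
next_id = len(heap)

while len(heap) > 1:
    w1, _, low = heapq.heappop(heap)    # lightest subtree goes on the 0 branch
    w2, _, high = heapq.heappop(heap)   # next lightest goes on the 1 branch
    merged = {s: "0" + c for s, c in low.items()}
    merged.update({s: "1" + c for s, c in high.items()})
    heapq.heappush(heap, (w1 + w2, next_id, merged))
    next_id += 1

codes = heap[0][2]


def is_prefix_free(table):
    """True if no code word is a prefix of another code word."""
    words = sorted(table.values())
    # After sorting, a prefix is always immediately followed by a word it prefixes.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


huffman_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
fixed_bits = 3 * sum(freqs.values())    # 5 symbols need 3 bits each at fixed length

print(is_prefix_free(codes))     # True
print(huffman_bits, fixed_bits)  # 195 285
```

For these frequencies, A receives a 1-bit code and the other four symbols receive 3-bit codes, so the 95 symbols take 195 bits instead of 285.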

Applications and Real-World Examples

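Huffman coding is widely used as the final entropy-coding stage of file, image, and audio compression formats. For example, the JPEG image format uses Huffman coding to encode its quantized coefficients, and the DEFLATE algorithm behind ZIP, gzip, and PNG combines LZ77 with Huffman coding. (The GIF image format, by contrast, uses LZW compression rather than Huffman coding.) The MP3 audio format also uses Huffman coding to reduce the size of the audio data.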

Alternatives and New Developments

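There are several alternatives to Huffman coding, including arithmetic coding, Shannon-Fano coding, and run-length encoding. Arithmetic coding can get closer to the theoretical limit because it is not restricted to a whole number of bits per symbol, while Shannon-Fano coding is an earlier method that does not always produce an optimal code. In recent years, there have been many new developments in the field of data compression, including the use of neural networks for compression and the development of lossy compression algorithms that use machine learning to remove less significant data from the dataset.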

Conclusion

Huffman coding is a powerful and widely used lossless compression algorithm that assigns shorter codes to more frequently occurring symbols in a dataset. It is used in a variety of applications, including file, image, and audio compression formats. The algorithm is relatively simple to implement and can significantly reduce the size of the data when symbol frequencies are uneven.
