哈夫曼編碼 · 數據結構與算法（JS版)

[toc] # 幾個概念 ## 路徑在一棵樹中，從一個結點到另一個結點所經過的所有結點，被我們稱為兩個結點之間的路徑。比如：A -> H 的路徑： ![](https://img.kancloud.cn/1d/d4/1dd44197a23aa5aa130a0e573339e6eb_1236x1084.png) ## 路徑長度在一棵樹中，從一個結點到另一個結點所經過的“邊”的數量，被我們稱為兩個結點之間的路徑長度。比如：A -> H的路徑長度為 3： ![](https://img.kancloud.cn/34/bc/34bc723d1197eb6f40b5d7ab31ab0819_1216x1076.png) ## 節點的帶權路徑長度每一個節點都可以有一個權重（數字）。帶權路徑長度：節點權重 * 節點到根節點的路徑長度比如：H 的帶權路徑長度 = 3 * 3 = 9 ![](https://img.kancloud.cn/25/4d/254d6b2f2a066c713251a34f913be694_1248x1128.png) ## 樹的帶權路徑長度樹的帶權路徑長度（也做WPL）：所有 `葉子` 節點的帶權路徑長度之和。比如：下面這棵樹的 WPL = 3*3 + 6*3 + 1*2 + 4*2 + 8*2 = 53 ![](https://img.kancloud.cn/57/dc/57dc02737778338ab788f7f00ff1a43d_1220x1148.png) ## 哈夫曼樹哈夫曼樹（Huffman Tree）：是在葉子結點和權重確定的情況下，帶權路徑長度最小的二叉樹，也被稱為最優二叉樹。比如下圖左側就比右側 WPL 小。 ![](https://img.kancloud.cn/92/ad/92ade9107dccfa537b6f5aac934bd33d_1210x620.png) 總結：核心特點：權重越大的節點，越靠近根節點。面試官問：什么是最優二叉樹？什么是哈夫曼樹？ ## 哈夫曼編碼哈夫曼編碼可以用來進行 `數據壓縮` 。 ### 編碼比如要壓縮：“acbcbacddaddaddccd” 1. 先統計文本中每個字符出現的頻率 ![](https://img.kancloud.cn/e3/ef/e3ef35efcf04208adb20ae55136fc516_502x506.png) 2. 頻率當作權重構建哈夫曼樹 ![](https://img.kancloud.cn/de/30/de3015365e17db53e820f89cb9364ae1_640x904.png) 3. 根據哈夫曼樹對字符串進行編碼編碼規則： 1. 左子節點是0，右子節點是1 2. 從根節點到葉子結果根據哈夫曼樹可以將字符串編碼為： * a:111 * ac:1111 0 * acb:1111 0110 * acbc:1111 011010 * acbcb:1111 0110 1011 0… 說明：編碼之后好像是變長了，但注意這里的存儲單位是不同的： - 1 字節(Byte) = 6 位(bit) - 一個字母 = 1 字節 = 8 位 - 一個漢字 = 2~4字節 = 16~32位 - 而1010這個哈夫曼編碼，單位是位，一個1或0只帶1 位。所以綜合：哈夫曼編碼的壓縮率在 20% ~ 90% 之間。 ### 解碼編碼過程就是： 1. 將哈夫曼編碼到哈夫曼樹上 “從根向下跑一遍” 2. 0 向左、 1 向右 3. 當遇到一個 “葉子” 節點時結束 4. 重復從根開始跑，重復 1~3 比如：1111 0110 1011 0... 到樹上跑得到：acbcb ## 用途 1. 數據壓縮 2. 加密、解密 # 構建哈夫曼樹的流程 1. 將帶權節點 `按升序` 放到數組中 ![](https://img.kancloud.cn/12/8b/128b5929fcb6921078de7bf385f0a283_240x934.png) 2. 取出數組中最小的兩個節點做為子節點構造一棵樹，和為父節點 ![](https://img.kancloud.cn/73/d8/73d8ff811a3ba7a711b98cc118321a2a_728x998.png) 3. 將父節點的值放回數組中并 `排序` ![](https://img.kancloud.cn/16/fc/16fce61c87f9ed15056c1bd92e3d6626_698x758.png) 4. 重復2~3步驟拿出最小兩個節點： ![](https://img.kancloud.cn/bd/c0/bdc0ec78d98a96918c1cfe61d4a84af2_800x790.png) 將的父節點放回并排序 ![](https://img.kancloud.cn/fa/bb/fabbcc40b85f411f2c7469373179446d_802x724.png) 5. 重復2~3 ![](https://img.kancloud.cn/19/9e/199ea6adfdb224a03b9021f49f4f76d0_832x810.png) ![](https://img.kancloud.cn/a1/22/a122afea34ac0cb71a3c004df3a51275_802x744.png) 6. 重復2~3 ![](https://img.kancloud.cn/89/ca/89ca511e426fddb9138a888fb724d4ae_958x908.png) ![](https://img.kancloud.cn/da/28/da28935b7f8cff23b6356dbb61e84bc2_1002x938.png) 7. 重復2~3 ![](https://img.kancloud.cn/bd/38/bd38c53f244326d64357c98d280b391e_1220x1138.png) ![](https://img.kancloud.cn/0b/1f/0b1f9c52ef4b2e234cbc8b1c9923800f_1200x1156.png) 最后數組中只剩下一個節點，說明構造結束！ # 代碼實現 - 構造哈夫曼樹 ~~~ class Node { constructor(data) { this.data = data this.left = null this.right = null } } class HuffmanTree { constructor(arr) { this.queue = arr this.root = null let first, second, parent while(this.queue.length>1) { // 排序 this.queue.sort((a,b)=>a-b) first = this.queue.shift() second = this.queue.shift() parent = first + second this.queue.unshift(parent) parent = new Node(parent) first = new Node(first) second = new Node(second) if(this.root === null) { parent.left = first parent.right = second this.root = parent } else { if(this.root.data == first.data) { parent.left = this.root parent.right = second this.root = parent } else { parent.left = first parent.right = this.root this.root = parent } } } } } ~~~