# Word Ladder
### Source
- leetcode: [Word Ladder | LeetCode OJ](https://leetcode.com/problems/word-ladder/)
- lintcode: [(120) Word Ladder](http://www.lintcode.com/en/problem/word-ladder/)
### Problem
Given two words (*start* and *end*) and a dictionary, find the length of the shortest transformation sequence from *start* to *end*, such that:
1. Only one letter can be changed at a time
1. Each intermediate word must exist in the dictionary
#### Example
Given: *start* = `"hit"`, *end* = `"cog"`, *dict* = `["hot","dot","dog","lot","log"]`

As one shortest transformation is `"hit" -> "hot" -> "dot" -> "dog" -> "cog"`, return its length `5`.
#### Note
- Return 0 if there is no such transformation sequence.
- All words have the same length.
- All words contain only lowercase alphabetic characters.
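### Solution
At first glance this looks like a variant of Edit Distance, but a careful reading shows it has little to do with dynamic programming. The problem has two key constraints: only one character may change per step, and every intermediate word must appear in the dictionary. That leaves roughly four cases:
1. start and end are equal.
1. end is in dict, and start can be transformed into a word in dict.
1. end is not in dict, but can be reached from start or from a word in dict.
1. end cannot be reached from start.

Because every intermediate word must appear in the dictionary, this is a graph search problem. Treat start, end, and the words in dict as nodes. Two words are joined by an edge when one can be turned into the other by changing a single letter, and every edge has weight 1 (one transformation step). The task is then to find the shortest distance between start and end. Dijkstra's shortest-path algorithm would solve it, but because all edges have the same weight it reduces to a plain breadth-first search. **The implementation uses [BFS](# "Breadth-First Search") and a hash table.**

First enqueue start. Then repeatedly dequeue a node and compare it against end. Next, enqueue every word in dict that is at distance 1 from the current node and has not been visited yet (a hash table tracks visited words), so that it is processed in the next level. Repeat until the queue is empty.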
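The level-by-level procedure above can be sketched as a small helper that records the ladder length of every word reachable from start. This is a minimal illustration rather than the solution itself: the class `LadderLevels` and the method `distances` are hypothetical names, and the sketch assumes end has already been placed in the word set.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical helper for illustration; not part of the original solution.
class LadderLevels {
    // Maps each word reachable from start to its ladder length (start counts as 1).
    static Map<String, Integer> distances(String start, Set<String> words) {
        Map<String, Integer> dist = new LinkedHashMap<>(); // doubles as the visited table
        Queue<String> queue = new ArrayDeque<>();
        dist.put(start, 1);
        queue.offer(start);
        while (!queue.isEmpty()) {
            String curr = queue.poll();
            char[] chars = curr.toCharArray();
            for (int i = 0; i < chars.length; i++) {
                char original = chars[i];
                for (char c = 'a'; c <= 'z'; c++) {
                    if (c == original) continue;
                    chars[i] = c;
                    String next = new String(chars);
                    // enqueue only unvisited dictionary words
                    if (words.contains(next) && !dist.containsKey(next)) {
                        dist.put(next, dist.get(curr) + 1);
                        queue.offer(next);
                    }
                }
                chars[i] = original; // restore before moving to the next position
            }
        }
        return dist;
    }
}
```

For the example input with `"cog"` added to the word set, `"cog"` is assigned 5, which matches the expected ladder length.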
### Java
~~~
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Queue;
import java.util.Set;

public class Solution {
    /**
     * @param start, a string
     * @param end, a string
     * @param dict, a set of string
     * @return an integer
     */
    public int ladderLength(String start, String end, Set<String> dict) {
        // reject missing or empty input
        if (start == null || end == null) return 0;
        if (start.length() == 0 || end.length() == 0) return 0;
        assert(start.length() == end.length());
        if (dict == null || dict.size() == 0) {
            return 0;
        }
        // a ladder made of a single word has length 1
        if (start.equals(end)) return 1;

        int ladderLen = 1;
        dict.add(end); // add end to dict, important!
        Queue<String> q = new LinkedList<String>();
        Set<String> hash = new HashSet<String>(); // visited words
        q.offer(start);
        hash.add(start);
        while (!q.isEmpty()) {
            ladderLen++;
            // process exactly one BFS level per iteration
            int qLen = q.size();
            for (int i = 0; i < qLen; i++) {
                String strTemp = q.poll();
                for (String nextWord : getNextWords(strTemp, dict)) {
                    if (nextWord.equals(end)) return ladderLen;
                    // filter visited word in the dict
                    if (hash.contains(nextWord)) continue;
                    q.offer(nextWord);
                    hash.add(nextWord);
                }
            }
        }

        return 0;
    }

    // collect every word in dict that differs from curr in exactly one position
    private Set<String> getNextWords(String curr, Set<String> dict) {
        Set<String> nextWords = new HashSet<String>();
        for (int i = 0; i < curr.length(); i++) {
            // fresh copy for each position, so earlier substitutions are undone
            char[] chars = curr.toCharArray();
            for (char c = 'a'; c <= 'z'; c++) {
                chars[i] = c;
                String temp = new String(chars);
                if (dict.contains(temp)) {
                    nextWords.add(temp);
                }
            }
        }

        return nextWords;
    }
}
~~~
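### Source Code Analysis
#### Implementation of `getNextWords`
Start with how to take a given word `curr` and select every word in dict at distance 1 from it. The obvious approach compares `curr` against each word in dict, character by character, and returns the words at distance 1. Finding neighbors this way costs $l \text{ (length of word)} \times n \text{ (size of dict)} \times m \text{ (queue length)} = O(lmn)$, which will [TLE](# "Time Limit Exceeded. The program ran longer on the OJ than the problem's time limit allows.") when dict is large. The data structure behind dict offers a better option: looking up any element can be treated as $O(1)$. Because dict is a hash table and individual words are usually short, we can instead generate every variant of the given word at distance 1 and check whether each one is in dict. This takes $O(26 \text{ (a to z)} \times l \times m) = O(lm)$, which is largely independent of the size of dict and greatly improves the running time.

Lesson learned: choose an implementation that fits the characteristics of the given data structure. When a hash table is available, take advantage of its $O(1)$ lookup.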
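For contrast, the slower dictionary-scanning approach described above might look like the following sketch. The names `NaiveNeighbors`, `differsByOne`, and `getNextWordsNaive` are hypothetical and exist only for this illustration. Each call visits all n words and compares up to l characters per word, which is where the extra factor of n comes from.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the naive neighbor search; not the recommended approach.
class NaiveNeighbors {
    // True when a and b have equal length and differ in exactly one position.
    static boolean differsByOne(String a, String b) {
        if (a.length() != b.length()) return false;
        int diff = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i) && ++diff > 1) return false;
        }
        return diff == 1;
    }

    // Scans the whole dictionary: O(n * l) work per call.
    static Set<String> getNextWordsNaive(String curr, Set<String> dict) {
        Set<String> nextWords = new HashSet<>();
        for (String word : dict) {
            if (differsByOne(curr, word)) nextWords.add(word);
        }
        return nextWords;
    }
}
```

Both versions return the same neighbors, except that this one never returns `curr` itself. They differ only in whether the work grows with the dictionary size or with the word length.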
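#### Using [BFS](# "Breadth-First Search") together with a hash table
[BFS](# "Breadth-First Search") performs the search, and the hash table records the nodes already visited. Provided the input may be modified, end must be added to dict; otherwise the search fails whenever end does not appear in dict.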
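When the caller's dictionary must not be modified, one possible variant copies it first and lets the copy serve as both the dictionary and the visited record. This is a sketch under stated assumptions, not the original solution: the class name `LadderNoMutation` is hypothetical, and returning 1 when start equals end is an assumed convention.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical variant that leaves the caller's dictionary untouched.
class LadderNoMutation {
    static int ladderLength(String start, String end, Set<String> dict) {
        if (start.equals(end)) return 1; // assumed convention: a one-word ladder
        Set<String> unvisited = new HashSet<>(dict); // private copy, safe to modify
        unvisited.add(end);
        unvisited.remove(start);
        Queue<String> queue = new ArrayDeque<>();
        queue.offer(start);
        int ladderLen = 1;
        while (!queue.isEmpty()) {
            ladderLen++;
            // process one BFS level per outer iteration
            for (int n = queue.size(); n > 0; n--) {
                char[] chars = queue.poll().toCharArray();
                for (int i = 0; i < chars.length; i++) {
                    char original = chars[i];
                    for (char c = 'a'; c <= 'z'; c++) {
                        chars[i] = c;
                        String next = new String(chars);
                        // remove() returns false for unknown or already visited words
                        if (!unvisited.remove(next)) continue;
                        if (next.equals(end)) return ladderLen;
                        queue.offer(next);
                    }
                    chars[i] = original;
                }
            }
        }
        return 0;
    }
}
```

Removing a word from the copy as soon as it is enqueued replaces the separate visited set, at the cost of one $O(n)$ copy up front.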
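### Complexity Analysis
The running time is dominated by `getNextWords`. Each word is enqueued at most once, so the method is called at most $n + 1$ times and performs $26 \times l$ lookups per call, for $O(26 \times l \times n) = O(ln)$ lookups overall. Building and hashing each candidate string adds another factor of $l$, giving $O(l^2 n)$ character operations. The queue and the visited hash table each hold at most $n + 1$ words, so the extra space is $O(n)$ in the worst case, for example when one word in dict is at distance 1 from all the others. It is smaller when only a few words are reachable from start.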
### Reference
- [Word Ladder reference solutions in Java/C++/Python](http://www.jiuzhang.com/solutions/word-ladder/)
- [Java Solution using Dijkstra's algorithm, with explanation - Leetcode Discuss](https://leetcode.com/discuss/50930/java-solution-using-dijkstras-algorithm-with-explanation)