# Binary Search - 二分搜索
二分搜索是一種在有序數組中尋找目標值的經典方法,也就是說使用前提是『有序數組』。非常簡單的題中『有序』特征非常明顯,但更多時候可能需要我們自己去構造『有序數組』。下面我們從最基本的二分搜索開始逐步深入。
### 模板一 - lower/upper bound
定義 lower bound 為在給定升序數組中大于等于目標值的最小索引,upper bound 則為小于等于目標值的最大索引,下面上代碼和測試用例。
### Java
~~~
import java.util.*;
public class Main {
public static void main(String[] args) {
int[] nums = new int[]{1,2,2,3,4,6,6,6,13,18};
System.out.println(lowerBound(nums, 6)); // 5
System.out.println(upperBound(nums, 6)); // 7
System.out.println(lowerBound(nums, 7)); // 8
System.out.println(upperBound(nums, 7)); // 7
}
/*
* nums[index] >= target, min(index)
*/
public static int lowerBound(int[] nums, int target) {
if (nums == null || nums.length == 0) return -1;
int lb = -1, ub = nums.length;
while (lb + 1 < ub) {
int mid = lb + (ub - lb) / 2;
if (nums[mid] < target) {
lb = mid;
} else {
ub = mid;
}
}
return lb + 1;
}
/*
* nums[index] <= target, max(index)
*/
public static int upperBound(int[] nums, int target) {
if (nums == null || nums.length == 0) return -1;
int lb = -1, ub = nums.length;
while (lb + 1 < ub) {
int mid = lb + (ub - lb) / 2;
if (nums[mid] > target) {
ub = mid;
} else {
lb = mid;
}
}
return ub - 1;
}
}
~~~
### 源碼分析
以`lowerBound`的實現為例,以上二分搜索的模板有幾個非常優雅的實現:
1. `while` 循環中 `lb + 1 < ub`, 而不是等號,因為取等號可能會引起死循環。初始化`lb < ub` 時,最后循環退出時一定有`lb + 1 == ub`.
1. `mid = lb + (ub - lb) / 2`, 可有效防止兩數相加后溢出。
1. `lb` 和 `ub` 的初始化,初始化為數組的兩端以外,這種初始化方式比起`0` 和`nums.length - 1` 有不少優點,詳述如下。
如果遇到有問插入索引的位置時,可以分三種典型情況:
1. 目標值在數組范圍之內,最后返回值一定是`lb + 1`
1. 目標值比數組最小值還小,此時`lb` 一直為`-1`, 故最后返回`lb + 1` 也沒錯,也可以將`-1` 理解為數組前一個更小的值
1. 目標值大于等于數組最后一個值,由于循環退出條件為`lb + 1 == lb`, 那么循環退出時一定有`lb = A.length - 1`, 應該返回`lb + 1`
綜上所述,返回`lb + 1`是非常優雅的實現。其實以上三種情況都可以統一為一種方式來理解,即索引`-1` 對應于數組前方一個非常小的數,索引`ub` 即對應數組后方一個非常大的數,那么要插入的數就一定在`lb` 和`ub` 之間了。
**有時復雜的邊界條件處理可以通過『補項』這種優雅的方式巧妙處理。**
關于lb 和 ub 的初始化,由于`mid = lb + (ub - lb) / 2`, 且有`lb + 1 < ub`,故 mid 還是有可能為`ub - 1`或者`lb + 1`的,在需要訪問`mid + 1`或者`mid - 1`處索引的元素時可能會越界。這時候就需要將初始化方式改為`lb = 0, ub = A.length - 1` 了,最后再加一個關于`lb, ub` 處索引元素的判斷即可。如 [Search for a Range](http://algorithm.yuanbin.me/zh-cn/binary_search/search_for_a_range.html) 和 [Find Peak Element](http://algorithm.yuanbin.me/zh-cn/binary_search/find_peak_element.html). 尤其是 Find Peak Element 中 lb 和 ub 的初始值如果初始化為-1和數組長度會帶來一些麻煩。
### 模板二 - 最優解
除了在有序數組中尋找目標值這種非常直接的二分搜索外,我們還可以利用二分搜索求最優解(最大值/最小值),通常這種題中只是隱含了『有序數組』,需要我們自己構造。
用數學語言來描述就是『求滿足某條件 C(x)C(x)C(x) 的最小/大的 xxx』,以求最小值為例,對于任意滿足條件的 xxx, 如果所有的 x≤x′≤UBx \leq x^\prime \leq UBx≤x′≤UB 對于 C(x′)C(x^\prime)C(x′) 都為真(其中 `UB` 可能為無窮大,也可能為滿足條件的最大的解,如果不滿足此條件就不能保證二分搜索的正確性),那么我們就能使用二分搜索進行求解,其中初始化時下界`lb` 初始化為不滿足條件的值`LB`, 上界初始化為滿足條件的上界`UB`. 隨后在`while` 循環內部每次取中,滿足條件就取`ub = mid`, 否則`lb = mid`, 那么最后`ub` 就是要求的最小值。求最大值時類似,只不過處理的是`lb`.
以 [POJ No.1064](http://poj.org/problem?id=1064) 為例。
### Problem
有 NNN 條繩子,它們的長度分別為 LiL_iLi. 如果從它們中切割出 KKK 條長度相同的繩子的話,這 KKK 條繩子每條最長能有多長?答案保留到小數點后兩位。
#### 輸入
~~~
N = 4, L = {8.02, 7.43, 4.57, 5.39}, K = 11
~~~
#### 輸出
2.00
### 題解
這道題看似是一個最優化問題,我們來嘗試下使用模板二的思想求解,**令 C(x)C(x)C(x) 為『可以得到 KKK 條長度為 xxx 的繩子』。**根據題意,我們可以將上述條件進一步細化為:C(x)=∑i(floor(Li/x))≥KC(x) = \sum_i(floor(L_i / x)) \geq KC(x)=i∑(floor(Li/x))≥K
我們現在來分析下可行解的上下界。由于答案保留小數點后兩位,顯然繩子長度一定大于0,大于0的小數點后保留兩位的最小值為`0.01`, 顯然如果問題最后有解,`0.01` 一定是可行解中最小的,且這個解可以分割出的繩子條數是最多的。一般在 OJ 上不同變量都是會給出范圍限制,那么我們將上界初始化為`最大范圍 + 0.01`, 它一定在可行解之外(也可以遍歷一遍數組取數組最大值,但其實二分后復雜度相差不大)。使用二分搜索后最后返回`lb` 即可。
### Java
~~~
import java.io.*;
import java.util.*;
public class Main {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
int n = in.nextInt();
int k = in.nextInt();
double[] nums = new double[n];
for (int i = 0; i < n; i++) {
nums[i] = in.nextDouble();
}
System.out.printf("%.2f\n", Math.floor(solve(nums, k) * 100) / 100);
}
public static double solve(double[] nums, int K) {
double lb = 0.00, ub = 10e5 + 0.01;
// while (lb + 0.001 < ub) {
for (int i = 0; i < 100; i++) {
double mid = lb + (ub - lb) / 2;
if (C(nums, mid, K)) {
lb = mid;
} else {
ub = mid;
}
}
return lb;
}
public static boolean C(double[] nums, double seg, int k) {
int count = 0;
for (double num : nums) {
count += Math.floor(num / seg);
}
return count >= k;
}
}
~~~
### 源碼分析
方法`C` 只做一件事,給定數組`nums`, 判斷是否能切割出`K` 條長度均為`seg` 的繩子。`while` 循環中使用`lb + 0.001 < ub`, 不能使用`0.01`, 因為計算`mid` 時有均值的計算,對于`double` 型數值否則會有較大誤差。
### 模板三 - 二分搜索的 `while` 結束條件判定
對于整型我們通常使用`lb + 1 < ub`, 但對于`double`型數據來說會有些精度上的丟失,使得結束條件不是那么好確定。像上題中采用的方法是題目中使用的精度除10。但有時候這種精度可能還是不夠,如果結束條件`lb + EPS < ub`中使用的 EPS 過小時 double 型數據精度有可能不夠從而導致死循環的產生!這時候我們將`while`循環體替換為`for (int i = 0; i < 100; i++)`, 100 次循環后可以達到 10?3010^{-30}10?30 精度范圍,一般都沒問題。
### Reference
- 《挑戰程序設計競賽》
- Preface
- Part I - Basics
- Basics Data Structure
- String
- Linked List
- Binary Tree
- Huffman Compression
- Queue
- Heap
- Stack
- Set
- Map
- Graph
- Basics Sorting
- Bubble Sort
- Selection Sort
- Insertion Sort
- Merge Sort
- Quick Sort
- Heap Sort
- Bucket Sort
- Counting Sort
- Radix Sort
- Basics Algorithm
- Divide and Conquer
- Binary Search
- Math
- Greatest Common Divisor
- Prime
- Knapsack
- Probability
- Shuffle
- Basics Misc
- Bit Manipulation
- Part II - Coding
- String
- strStr
- Two Strings Are Anagrams
- Compare Strings
- Anagrams
- Longest Common Substring
- Rotate String
- Reverse Words in a String
- Valid Palindrome
- Longest Palindromic Substring
- Space Replacement
- Wildcard Matching
- Length of Last Word
- Count and Say
- Integer Array
- Remove Element
- Zero Sum Subarray
- Subarray Sum K
- Subarray Sum Closest
- Recover Rotated Sorted Array
- Product of Array Exclude Itself
- Partition Array
- First Missing Positive
- 2 Sum
- 3 Sum
- 3 Sum Closest
- Remove Duplicates from Sorted Array
- Remove Duplicates from Sorted Array II
- Merge Sorted Array
- Merge Sorted Array II
- Median
- Partition Array by Odd and Even
- Kth Largest Element
- Binary Search
- Binary Search
- Search Insert Position
- Search for a Range
- First Bad Version
- Search a 2D Matrix
- Search a 2D Matrix II
- Find Peak Element
- Search in Rotated Sorted Array
- Search in Rotated Sorted Array II
- Find Minimum in Rotated Sorted Array
- Find Minimum in Rotated Sorted Array II
- Median of two Sorted Arrays
- Sqrt x
- Wood Cut
- Math and Bit Manipulation
- Single Number
- Single Number II
- Single Number III
- O1 Check Power of 2
- Convert Integer A to Integer B
- Factorial Trailing Zeroes
- Unique Binary Search Trees
- Update Bits
- Fast Power
- Hash Function
- Count 1 in Binary
- Fibonacci
- A plus B Problem
- Print Numbers by Recursion
- Majority Number
- Majority Number II
- Majority Number III
- Digit Counts
- Ugly Number
- Plus One
- Linked List
- Remove Duplicates from Sorted List
- Remove Duplicates from Sorted List II
- Remove Duplicates from Unsorted List
- Partition List
- Two Lists Sum
- Two Lists Sum Advanced
- Remove Nth Node From End of List
- Linked List Cycle
- Linked List Cycle II
- Reverse Linked List
- Reverse Linked List II
- Merge Two Sorted Lists
- Merge k Sorted Lists
- Reorder List
- Copy List with Random Pointer
- Sort List
- Insertion Sort List
- Check if a singly linked list is palindrome
- Delete Node in the Middle of Singly Linked List
- Rotate List
- Swap Nodes in Pairs
- Remove Linked List Elements
- Binary Tree
- Binary Tree Preorder Traversal
- Binary Tree Inorder Traversal
- Binary Tree Postorder Traversal
- Binary Tree Level Order Traversal
- Binary Tree Level Order Traversal II
- Maximum Depth of Binary Tree
- Balanced Binary Tree
- Binary Tree Maximum Path Sum
- Lowest Common Ancestor
- Invert Binary Tree
- Diameter of a Binary Tree
- Construct Binary Tree from Preorder and Inorder Traversal
- Construct Binary Tree from Inorder and Postorder Traversal
- Subtree
- Binary Tree Zigzag Level Order Traversal
- Binary Tree Serialization
- Binary Search Tree
- Insert Node in a Binary Search Tree
- Validate Binary Search Tree
- Search Range in Binary Search Tree
- Convert Sorted Array to Binary Search Tree
- Convert Sorted List to Binary Search Tree
- Binary Search Tree Iterator
- Exhaustive Search
- Subsets
- Unique Subsets
- Permutations
- Unique Permutations
- Next Permutation
- Previous Permuation
- Unique Binary Search Trees II
- Permutation Index
- Permutation Index II
- Permutation Sequence
- Palindrome Partitioning
- Combinations
- Combination Sum
- Combination Sum II
- Minimum Depth of Binary Tree
- Word Search
- Dynamic Programming
- Triangle
- Backpack
- Backpack II
- Minimum Path Sum
- Unique Paths
- Unique Paths II
- Climbing Stairs
- Jump Game
- Word Break
- Longest Increasing Subsequence
- Palindrome Partitioning II
- Longest Common Subsequence
- Edit Distance
- Jump Game II
- Best Time to Buy and Sell Stock
- Best Time to Buy and Sell Stock II
- Best Time to Buy and Sell Stock III
- Best Time to Buy and Sell Stock IV
- Distinct Subsequences
- Interleaving String
- Maximum Subarray
- Maximum Subarray II
- Longest Increasing Continuous subsequence
- Longest Increasing Continuous subsequence II
- Graph
- Find the Connected Component in the Undirected Graph
- Route Between Two Nodes in Graph
- Topological Sorting
- Word Ladder
- Bipartial Graph Part I
- Data Structure
- Implement Queue by Two Stacks
- Min Stack
- Sliding Window Maximum
- Longest Words
- Heapify
- Problem Misc
- Nuts and Bolts Problem
- String to Integer
- Insert Interval
- Merge Intervals
- Minimum Subarray
- Matrix Zigzag Traversal
- Valid Sudoku
- Add Binary
- Reverse Integer
- Gray Code
- Find the Missing Number
- Minimum Window Substring
- Continuous Subarray Sum
- Continuous Subarray Sum II
- Longest Consecutive Sequence
- Part III - Contest
- Google APAC
- APAC 2015 Round B
- Problem A. Password Attacker
- Microsoft
- Microsoft 2015 April
- Problem A. Magic Box
- Problem B. Professor Q's Software
- Problem C. Islands Travel
- Problem D. Recruitment
- Microsoft 2015 April 2
- Problem A. Lucky Substrings
- Problem B. Numeric Keypad
- Problem C. Spring Outing
- Microsoft 2015 September 2
- Problem A. Farthest Point
- Appendix I Interview and Resume
- Interview
- Resume