<ruby id="bdb3f"></ruby>

    <p id="bdb3f"><cite id="bdb3f"></cite></p>

      <p id="bdb3f"><cite id="bdb3f"><th id="bdb3f"></th></cite></p><p id="bdb3f"></p>
        <p id="bdb3f"><cite id="bdb3f"></cite></p>

          <pre id="bdb3f"></pre>
          <pre id="bdb3f"><del id="bdb3f"><thead id="bdb3f"></thead></del></pre>

          <ruby id="bdb3f"><mark id="bdb3f"></mark></ruby><ruby id="bdb3f"></ruby>
          <pre id="bdb3f"><pre id="bdb3f"><mark id="bdb3f"></mark></pre></pre><output id="bdb3f"></output><p id="bdb3f"></p><p id="bdb3f"></p>

          <pre id="bdb3f"><del id="bdb3f"><progress id="bdb3f"></progress></del></pre>

                <ruby id="bdb3f"></ruby>

                ??一站式輕松地調用各大LLM模型接口,支持GPT4、智譜、豆包、星火、月之暗面及文生圖、文生視頻 廣告
                # 10-散列 [原文鏈接](http://code.google.com/p/guava-libraries/wiki/HashingExplained) [譯文鏈接](http://ifeve.com/google-guava-hashing) **譯者:**沈義揚 ## 概述 Java內建的散列碼[hash code]概念被限制為32位,并且沒有分離散列算法和它們所作用的數據,因此很難用備選算法進行替換。此外,使用Java內建方法實現的散列碼通常是劣質的,部分是因為它們最終都依賴于JDK類中已有的劣質散列碼。 Object.hashCode往往很快,但是在預防碰撞上卻很弱,也沒有對分散性的預期。這使得它們很適合在散列表中運用,因為額外碰撞只會帶來輕微的性能損失,同時差勁的分散性也可以容易地通過再散列來糾正(Java中所有合理的散列表都用了再散列方法)。然而,在簡單散列表以外的散列運用中,Object.hashCode幾乎總是達不到要求——因此,有了[com.google.common.hash](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/package-summary.html)包。 ## 散列包的組成 在這個包的Java doc中,我們可以看到很多不同的類,但是文檔中沒有明顯地表明它們是怎樣 一起配合工作的。在介紹散列包中的類之前,讓我們先來看下面這段代碼范例: ``` HashFunction hf = Hashing.md5(); HashCode hc = hf.newHasher() .putLong(id) .putString(name, Charsets.UTF_8) .putObject(person, personFunnel) .hash(); ``` ### HashFunction [HashFunction](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/HashFunction.html)是一個單純的(引用透明的)、無狀態的方法,它把任意的數據塊映射到固定數目的位指,并且保證相同的輸入一定產生相同的輸出,不同的輸入盡可能產生不同的輸出。 ### Hasher HashFunction的實例可以提供有狀態的[Hasher](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/Hasher.html),Hasher提供了流暢的語法把數據添加到散列運算,然后獲取散列值。Hasher可以接受所有原生類型、字節數組、字節數組的片段、字符序列、特定字符集的字符序列等等,或者任何給定了Funnel實現的對象。 Hasher實現了PrimitiveSink接口,這個接口為接受原生類型流的對象定義了fluent風格的API ### Funnel Funnel描述了如何把一個具體的對象類型分解為原生字段值,從而寫入PrimitiveSink。比如,如果我們有這樣一個類: ``` class Person { final int id; final String firstName; final String lastName; final int birthYear; } ``` 它對應的Funnel實現可能是: ``` Funnel<Person> personFunnel = new Funnel<Person>() { @Override public void funnel(Person person, PrimitiveSink into) { into .putInt(person.id) .putString(person.firstName, Charsets.UTF_8) .putString(person.lastName, Charsets.UTF_8) .putInt(birthYear); } } ``` 注:putString(“abc”, Charsets.UTF_8).putString(“def”, Charsets.UTF_8)完全等同于putString(“ab”, Charsets.UTF_8).putString(“cdef”, Charsets.UTF_8),因為它們提供了相同的字節序列。這可能帶來預料之外的散列沖突。增加某種形式的分隔符有助于消除散列沖突。 ### HashCode 一旦Hasher被賦予了所有輸入,就可以通過[hash()](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/Hasher.html#hash%28%29)方法獲取[HashCode](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/HashCode.html)實例(多次調用hash()方法的結果是不確定的)。HashCode可以通過[asInt()](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/HashCode.html#asInt%28%29)、[asLong()](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/HashCode.html#asLong%28%29)、[asBytes()](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/HashCode.html#asBytes%28%29)方法來做相等性檢測,此外,[writeBytesTo(array, offset, maxLength)](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/HashCode.html#writeBytesTo%28byte[], int, int%29)把散列值的前maxLength字節寫入字節數組。 ## 布魯姆過濾器[BloomFilter] 布魯姆過濾器是哈希運算的一項優雅運用,它可以簡單地基于Object.hashCode()實現。簡而言之,布魯姆過濾器是一種概率數據結構,它允許你檢測某個對象是一定不在過濾器中,還是可能已經添加到過濾器了。[布魯姆過濾器的維基頁面](http://en.wikipedia.org/wiki/Bloom_filter)對此作了全面的介紹,同時我們推薦github中的一個[教程](http://billmill.org/bloomfilter-tutorial/)。 Guava散列包有一個內建的布魯姆過濾器實現,你只要提供Funnel就可以使用它。你可以使用[create(Funnel funnel, int expectedInsertions, double falsePositiveProbability)](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/BloomFilter.html#create%28com.google.common.hash.Funnel, int, double%29)方法獲取[BloomFilter&lt;T&gt;](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/BloomFilter.html),缺省誤檢率[falsePositiveProbability]為3%。BloomFilter&lt;T&gt;提供了[boolean mightContain(T)](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/BloomFilter.html#mightContain%28T%29) 和[void put(T)](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/hash/BloomFilter.html#put%28T%29),它們的含義都不言自明了。 ``` BloomFilter<Person> friends = BloomFilter.create(personFunnel, 500, 0.01); for(Person friend : friendsList) { friends.put(friend); } // 很久以后 if (friends.mightContain(dude)) { //dude不是朋友還運行到這里的概率為1% //在這兒,我們可以在做進一步精確檢查的同時觸發一些異步加載 } ``` ## Hashing類 Hashing類提供了若干散列函數,以及運算HashCode對象的工具方法。 ### 已提供的散列函數 | [`md5()`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/com/google/common/hash/Hashing.html#md5%28%29) | [`murmur3_128()`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/com/google/common/hash/Hashing.html#murmur3_128%28%29) | [`murmur3_32()`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/com/google/common/hash/Hashing.html#murmur3_32%28%29) | [`sha1()`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/com/google/common/hash/Hashing.html#sha1%28%29) | |:--- |:--- |:--- |:--- | | [`sha256()`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/com/google/common/hash/Hashing.html#sha256%28%29) | [`sha512()`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/com/google/common/hash/Hashing.html#sha512%28%29) | [`goodFastHash(int bits)`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/com/google/common/hash/Hashing.html#goodFastHash%28int%29) | ### HashCode運算 | **方法** | **描述** | |:--- |:--- | | [`HashCode combineOrdered( Iterable&lt;HashCode&gt;)`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/com/google/common/hash/Hashing.html#combineOrdered%28java.lang.Iterable%29) | 以有序方式聯接散列碼,如果兩個散列集合用該方法聯接出的散列碼相同,那么散列集合的元素可能是順序相等的 | | [`HashCode ? combineUnordered( Iterable&lt;HashCode&gt;)`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/com/google/common/hash/Hashing.html#combineUnordered%28java.lang.Iterable%29) | 以無序方式聯接散列碼,如果兩個散列集合用該方法聯接出的散列碼相同,那么散列集合的元素可能在某種排序下是相等的 | | [`int ? consistentHash( HashCode, int buckets)`](http://docs.guava-libraries.googlecode.com/git-history/release12/javadoc/co…le/common/hash/Hashing.html#consistentHash%28com.google.common.hash.HashCode, int%29) | 為給定的”桶”大小返回一致性哈希值。當”桶”增長時,該方法保證最小程度的一致性哈希值變化。詳見[一致性哈希](http://en.wikipedia.org/wiki/Consistent_hashing)。 |
                  <ruby id="bdb3f"></ruby>

                  <p id="bdb3f"><cite id="bdb3f"></cite></p>

                    <p id="bdb3f"><cite id="bdb3f"><th id="bdb3f"></th></cite></p><p id="bdb3f"></p>
                      <p id="bdb3f"><cite id="bdb3f"></cite></p>

                        <pre id="bdb3f"></pre>
                        <pre id="bdb3f"><del id="bdb3f"><thead id="bdb3f"></thead></del></pre>

                        <ruby id="bdb3f"><mark id="bdb3f"></mark></ruby><ruby id="bdb3f"></ruby>
                        <pre id="bdb3f"><pre id="bdb3f"><mark id="bdb3f"></mark></pre></pre><output id="bdb3f"></output><p id="bdb3f"></p><p id="bdb3f"></p>

                        <pre id="bdb3f"><del id="bdb3f"><progress id="bdb3f"></progress></del></pre>

                              <ruby id="bdb3f"></ruby>

                              哎呀哎呀视频在线观看