6.sizzle編譯 · jQuery源碼構架

[toc] ## compile 講了這么久的 Sizzle，總感覺差了那么一口氣，對于一個 selector，我們把它生成 tokens，進行優化，優化的步驟包括去頭和生成 seed 集合。對于這些種子集合，我們知道最后的匹配結果是來自于集合中的一部分，似乎接下來的任務也已經明確：對種子進行過濾（或者稱其為匹配）。匹配的過程其實很簡單，就是對 DOM 元素進行判斷，而且弱是那種一代關系（>）或臨近兄弟關系（+），不滿足，就結束，若為后代關系（space）或者兄弟關系（~），會進行多次判斷，要么找到一個正確的，要么結束，不過仍需要考慮回溯問題。比如div > div.seq h2 ~ p，已經對應的把它們劃分成 tokens，如果每個 seed 都走一遍流程顯然太麻煩。一種比較合理的方法就是對應每個可判斷的 token 生成一個閉包函數，統一進行查找。 Expr.filter 是用來生成匹配函數的，它大概長這樣： ``` Expr.filter = { "ID": function(id){...}, "TAG": function(nodeNameSelector){...}, "CLASS": function(className){...}, "ATTR": function(name, operator, check){...}, "CHILD": function(type, what, argument, first, last){...}, "PSEUDO": function(pseudo, argument){...} } ``` 看兩個例子，一切都懂了： ``` Expr.filter["ID"] = function( id ) { var attrId = id.replace( runescape, funescape ); //這里返回一個函數 return function( elem ) { return elem.getAttribute("id") === attrId; }; }; ``` 對 tokens 分析，讓發現其 type 為 ID，則把其 id 保留，并返回一個檢查函數，參數為 elem，用于判斷該 DOM 的 id 是否與 tokens 的 id 一致。這種做法的好處是，編譯一次，執行多次。那么，是這樣的嗎？我們再來看看其他例子： ``` Expr.filter["TAG"] = function( nodeNameSelector ) { var nodeName = nodeNameSelector.replace( runescape, funescape ).toLowerCase(); return nodeNameSelector === "*" ? //返回一個函數 function() { return true; } : // 參數為 elem function( elem ) { return elem.nodeName && elem.nodeName.toLowerCase() === nodeName; }; }; Expr.filter["ATTR"] = function( name, operator, check ) { // 返回一個函數 return function( elem ) { var result = Sizzle.attr( elem, name ); if ( result == null ) { return operator === "!="; } if ( !operator ) { return true; } result += ""; return operator === "=" ? result === check : operator === "!=" ? result !== check : operator === "^=" ? check && result.indexOf( check ) === 0 : operator === "*=" ? check && result.indexOf( check ) > -1 : operator === "$=" ? check && result.slice( -check.length ) === check : operator === "~=" ? ( " " + result.replace( rwhitespace, " " ) + " " ).indexOf( check ) > -1 : operator === "|=" ? result === check || result.slice( 0, check.length + 1 ) === check + "-" : false; }; }, ``` 最后的返回結果： - `input[type=button]` 屬性 type 類型為 button； - `input[type!=button]` 屬性 type 類型不等于 button； - `input[name^=pre]` 屬性 name 以 pre 開頭； - `input[name*=new]` 屬性 name 中包含 new； - `input[name$=ed]` 屬性 name 以 ed 結尾； - `input[name=~=new]` 屬性 name 有用空格分離的 new； - `input[name|=zh]` 屬性 name 要么等于 zh，要么以 zh 開頭且后面有關連字符 -。所以對于一個 token，即生成了一個閉包函數，該函數接收的參數為一個 DOM，用來判斷該 DOM 元素是否是符合 token 的約束條件，比如 id 或 className 等等。如果將多個 token （即 tokens）都這么來處理，會得到一個專門用來判斷的函數數組，這樣子對于 seed 中的每一個元素，就可以用這個函數數組對其父元素或兄弟節點挨個判斷，效率大大提升，即所謂的編譯一次，多次使用。 ## compile 源碼直接貼上 compile 函數代碼，這里會有 matcherFromTokens 和 matcherFromGroupMatchers 這兩個函數，也一并介紹了。 ``` var compile = function(selector, match) { var i,setMatchers = [],elementMatchers = [], cached = compilerCache[selector + " "]; // 判斷有沒有緩存，好像每個函數都會判斷 if (!cached) { if (!match) { // 判斷 match 是否生成 tokens match = tokenize(selector); } i = match.length; while (i--) { // 這里將 tokens 交給了這個函數 cached = matcherFromTokens(match[i]); if (cached[expando]) { setMatchers.push(cached); } else { elementMatchers.push(cached); } } // 放到緩存 cached = compilerCache( selector, // 這個函數生成最終的匹配器 matcherFromGroupMatchers(elementMatchers, setMatchers) ); // Save selector and tokenization cached.selector = selector; } return cached; }; ``` 編譯 compile 函數貌似很簡單，來看 matcherFromTokens： ``` function matcherFromTokens(tokens) { var checkContext,matcher,j,len = tokens.length,leadingRelative = Expr.relative[tokens[0].type], implicitRelative = leadingRelative || Expr.relative[" "], i = leadingRelative ? 1 : 0, // 確保元素都能找到 // addCombinator 就是對 Expr.relative 進行判斷 /* Expr.relative = { ">": { dir: "parentNode", first: true }, " ": { dir: "parentNode" }, "+": { dir: "previousSibling", first: true }, "~": { dir: "previousSibling" } }; */ matchContext = addCombinator( function(elem) { return elem === checkContext; },implicitRelative,true), matchAnyContext = addCombinator( function(elem) { return indexOf(checkContext, elem) > -1; },implicitRelative,true), matchers = [ function(elem, context, xml) { var ret = !leadingRelative && (xml || context !== outermostContext) || ((checkContext = context).nodeType ? matchContext(elem, context, xml) : matchAnyContext(elem, context, xml)); // Avoid hanging onto element (issue #299) checkContext = null; return ret; } ]; for (; i < len; i++) { // 處理 "空 > ~ +" if (matcher = Expr.relative[tokens[i].type]) { matchers = [addCombinator(elementMatcher(matchers), matcher)]; } else { // 處理 ATTR CHILD CLASS ID PSEUDO TAG，filter 函數在這里 matcher = Expr.filter[tokens[i].type].apply(null, tokens[i].matches); // Return special upon seeing a positional matcher // 偽類會把selector分兩部分 if (matcher[expando]) { // Find the next relative operator (if any) for proper handling j = ++i; for (; j < len; j++) { if (Expr.relative[tokens[j].type]) { break; } } return setMatcher( i > 1 && elementMatcher(matchers), i > 1 && toSelector( // If the preceding token was a descendant combinator, insert an implicit any-element `*` tokens .slice(0, i - 1) .concat({value: tokens[i - 2].type === " " ? "*" : ""}) ).replace(rtrim, "$1"), matcher, i < j && matcherFromTokens(tokens.slice(i, j)), j < len && matcherFromTokens(tokens = tokens.slice(j)), j < len && toSelector(tokens) ); } matchers.push(matcher); } } return elementMatcher(matchers); } ``` 其中 addCombinator 函數用于生成 curry 函數，來解決 Expr.relative 情況： ``` function addCombinator(matcher, combinator, base) { var dir = combinator.dir, skip = combinator.next, key = skip || dir, checkNonElements = base && key === "parentNode", doneName = done++; return combinator.first ? // Check against closest ancestor/preceding element function(elem, context, xml) { while (elem = elem[dir]) { if (elem.nodeType === 1 || checkNonElements) { return matcher(elem, context, xml); } } return false; } : // Check against all ancestor/preceding elements function(elem, context, xml) { var oldCache, uniqueCache, outerCache, newCache = [dirruns, doneName]; // We can't set arbitrary data on XML nodes, so they don't benefit from combinator caching if (xml) { while (elem = elem[dir]) { if (elem.nodeType === 1 || checkNonElements) { if (matcher(elem, context, xml)) { return true; } } } } else { while (elem = elem[dir]) { if (elem.nodeType === 1 || checkNonElements) { outerCache = elem[expando] || (elem[expando] = {}); // Support: IE <9 only // Defend against cloned attroperties (jQuery gh-1709) uniqueCache = outerCache[elem.uniqueID] || (outerCache[elem.uniqueID] = {}); if (skip && skip === elem.nodeName.toLowerCase()) { elem = elem[dir] || elem; } else if ((oldCache = uniqueCache[key]) && oldCache[0] === dirruns && oldCache[1] === doneName) { // Assign to newCache so results back-propagate to previous elements return newCache[2] = oldCache[2]; } else { // Reuse newcache so results back-propagate to previous elements uniqueCache[key] = newCache; // A match means we're done; a fail means we have to keep checking if (newCache[2] = matcher(elem, context, xml)) { return true; } } } } } return false; }; } ``` 其中 elementMatcher 函數用于生成匹配器： ``` function elementMatcher(matchers) { return matchers.length > 1 ? function(elem, context, xml) { var i = matchers.length; while (i--) { if (!matchers[i](elem, context, xml)) { return false; } } return true; } : matchers[0]; } matcherFromGroupMatchers 如下： function matcherFromGroupMatchers(elementMatchers, setMatchers) { var bySet = setMatchers.length > 0, byElement = elementMatchers.length > 0, superMatcher = function(seed, context, xml, results, outermost) { var elem,j,matcher,matchedCount = 0,i = "0",unmatched = seed && [],setMatched = [], contextBackup = outermostContext, // We must always have either seed elements or outermost context elems = seed || byElement && Expr.find["TAG"]("*", outermost), // Use integer dirruns iff this is the outermost matcher dirrunsUnique = dirruns += contextBackup == null ? 1 : Math.random() || 0.1,len = elems.length; if (outermost) { outermostContext = context === document || context || outermost; } // Add elements passing elementMatchers directly to results // Support: IE<9, Safari // Tolerate NodeList properties (IE: "length"; Safari: <number>) matching elements by id for (; i !== len && (elem = elems[i]) != null; i++) { if (byElement && elem) { j = 0; if (!context && elem.ownerDocument !== document) { setDocument(elem); xml = !documentIsHTML; } while (matcher = elementMatchers[j++]) { if (matcher(elem, context || document, xml)) { results.push(elem); break; } } if (outermost) { dirruns = dirrunsUnique; } } // Track unmatched elements for set filters if (bySet) { // They will have gone through all possible matchers if (elem = !matcher && elem) { matchedCount--; } // Lengthen the array for every element, matched or not if (seed) { unmatched.push(elem); } } } // `i` is now the count of elements visited above, and adding it to `matchedCount` // makes the latter nonnegative. matchedCount += i; // Apply set filters to unmatched elements // NOTE: This can be skipped if there are no unmatched elements (i.e., `matchedCount` // equals `i`), unless we didn't visit _any_ elements in the above loop because we have // no element matchers and no seed. // Incrementing an initially-string "0" `i` allows `i` to remain a string only in that // case, which will result in a "00" `matchedCount` that differs from `i` but is also // numerically zero. if (bySet && i !== matchedCount) { j = 0; while (matcher = setMatchers[j++]) { matcher(unmatched, setMatched, context, xml); } if (seed) { // Reintegrate element matches to eliminate the need for sorting if (matchedCount > 0) { while (i--) { if (!(unmatched[i] || setMatched[i])) { setMatched[i] = pop.call(results); } } } // Discard index placeholder values to get only actual matches setMatched = condense(setMatched); } // Add matches to results push.apply(results, setMatched); // Seedless set matches succeeding multiple successful matchers stipulate sorting if (outermost && !seed && setMatched.length > 0 && matchedCount + setMatchers.length > 1) { Sizzle.uniqueSort(results); } } // Override manipulation of globals by nested matchers if (outermost) { dirruns = dirrunsUnique; outermostContext = contextBackup; } return unmatched; }; return bySet ? markFunction(superMatcher) : superMatcher; } ``` 這個過程太復雜了，請原諒我無法耐心的看完。。。先留名，以后分析。。。到此，其實已經可以結束了，但我本著負責的心態，我們再來理一下 Sizzle 整個過程。 Sizzle 雖然獨立出去，單獨成一個項目，不過在 jQuery 中的代表就是 jQuery.find 函數，這兩個函數其實就是同一個，完全等價的。然后介紹 tokensize 函數，這個函數的被稱為詞法分析，作用就是將 selector 劃分成 tokens 數組，數組每個元素都有 value 和 type 值。然后是 select 函數，這個函數的功能起著優化作用，去頭去尾，并 Expr.find 函數生成 seed 種子數組。后面的介紹就馬馬虎虎了，我本身看的也不少很懂。compile 函數進行預編譯，就是對去掉 seed 后剩下的 selector 生成閉包函數，又把閉包函數生成一個大的 superMatcher 函數，這個時候就可用這個 superMatcher(seed) 來處理 seed 并得到最終的結果。那么 superMatcher 是什么？ ## superMatcher 前面就已經說過，這才是 compile()() 函數的正確使用方法，而 compile() 的返回值即 superMatcher，無論是介紹 matcherFromTokens 還說介紹 matcherFromGroupMatchers，其結果都是為了生成超級匹配，然后處理 seed，這是一個考驗的時刻，只有經得住篩選才會留下來。 ## 總結下面是別人總結的一個流程圖： ![](http://olphpb1zg.bkt.clouddn.com/22100009-fb059131d1ed49519db4810fd8a4f20b.jpg) **第一步** ``` div > p + div.aaron input[type="checkbox"] ``` 從最右邊先通過 Expr.find 獲得 seed 數組，在這里的 input 是 TAG，所以通過 getElementsByTagName() 函數。 **第二步** 重組 selector，此時除去 input 之后的 selector： ``` div > p + div.aaron [type="checkbox"] ``` **第三步** 此時通過 Expr.relative 將 tokens 根據關系分成緊密關系和非緊密關系，比如 [">", "+"] 就是緊密關系，其 first = true。而對于 [" ", "~"] 就是非緊密關系。緊密關系在篩選時可以快速判斷。 matcherFromTokens 根據關系編譯閉包函數，為四組： ``` div > p + div.aaron input[type="checkbox"] ``` 編譯函數主要借助 Expr.filter 和 Expr.relative。 **第四步** 將所有的編譯閉包函數放到一起，生成 superMatcher 函數。 ``` function( elem, context, xml ) { var i = matchers.length; while ( i-- ) { if ( !matchers[i]( elem, context, xml ) ) { return false; } } return true; } ``` 從右向左，處理 seed 集合，如果有一個不匹配，則返回 false。如果成功匹配，則說明該 seed 元素是符合篩選條件的，返回給 results。