Walking Through the Video JitterBuffer in WebRTC (Part 2) · 音視頻開發之路

## VCMDecodingState

VCMDecodingState decides whether a NALU can be decoded continuously. The criteria differ by codec, and three codecs are supported: VP8, VP9, and H264. Here are the member variables it defines:

~~~
uint16_t sequence_num_;
uint32_t time_stamp_;
int picture_id_;
int temporal_id_;
int tl0_pic_id_;
bool full_sync_;  // Sync flag when temporal layers are used.
~~~

picture\_id, temporal\_id, and tl0\_pic\_id are carried in VP8 and VP9 streams. They identify the relationship between NALUs and whether they can be decoded continuously. H264 carries none of this information; its handling is visible in the member function `ContinuousFrame`. This article only covers the H264 path.

### Member function ContinuousFrame

~~~
bool VCMDecodingState::ContinuousFrame(const VCMFrameBuffer* frame) const {
  // Check continuity based on the following hierarchy:
  // - Temporal layers (stop here if out of sync).
  // - Picture Id when available.
  // - Sequence numbers.
  // Return true when in initial state.
  // Note that when a method is not applicable it will return false.
  assert(frame != NULL);
  // A key frame is always considered continuous as it doesn't refer to any
  // frames and therefore won't introduce any errors even if prior frames are
  // missing.
  if (frame->FrameType() == VideoFrameType::kVideoFrameKey &&
      HaveSpsAndPps(frame->GetNaluInfos())) {
    return true;
  }
  // When in the initial state we always require a key frame to start decoding.
  if (in_initial_state_)
    return false;
  if (ContinuousLayer(frame->TemporalId(), frame->Tl0PicId()))
    return true;
  // tl0picId is either not used, or should remain unchanged.
  if (frame->Tl0PicId() != tl0_pic_id_)
    return false;
  // Base layers are not continuous or temporal layers are inactive.
  // In the presence of temporal layers, check for Picture ID/sequence number
  // continuity if sync can be restored by this frame.
  if (!full_sync_ && !frame->LayerSync())
    return false;
  if (UsingPictureId(frame)) {
    if (UsingFlexibleMode(frame)) {
      return ContinuousFrameRefs(frame);
    } else {
      return ContinuousPictureId(frame->PictureId());
    }
  } else {
    return ContinuousSeqNum(static_cast<uint16_t>(frame->GetLowSeqNum())) &&
           HaveSpsAndPps(frame->GetNaluInfos());
  }
}
~~~

For an H264 NALU, the picture id is kNoPictureId, Tl0PicId is kNoTl0PicIdx, and TemporalId is kNoTemporalIdx, so every check based on picture id or temporal id can be ignored. The logic that actually runs for H264 is this statement:

~~~
return ContinuousSeqNum(static_cast<uint16_t>(frame->GetLowSeqNum())) &&
       HaveSpsAndPps(frame->GetNaluInfos());
~~~

In other words, inter-frame decoding continuity is decided by the seqnum and by whether the SPS and PPS are present.

**If two NALUs are consecutive, the lowest seqnum of the later NALU equals the highest seqnum of the earlier NALU plus 1. The member function ContinuousSeqNum implements exactly this check.**

### Member function HaveSpsAndPps

It does two things:

1. Determines whether the NALUs belong to the same GOP.
2. Checks whether the SPS and PPS for that GOP are available.

~~~
bool VCMDecodingState::HaveSpsAndPps(const std::vector<NaluInfo>& nalus) const {
  std::set<int> new_sps;
  std::map<int, int> new_pps;
  for (const NaluInfo& nalu : nalus) {
    // Check if this nalu actually contains sps/pps information or dependencies.
    if (nalu.sps_id == -1 && nalu.pps_id == -1)
      continue;
    switch (nalu.type) {
      case H264::NaluType::kPps:
        if (nalu.pps_id < 0) {
          RTC_LOG(LS_WARNING) << "Received pps without pps id.";
        } else if (nalu.sps_id < 0) {
          RTC_LOG(LS_WARNING) << "Received pps without sps id.";
        } else {
          new_pps[nalu.pps_id] = nalu.sps_id;
        }
        break;
      case H264::NaluType::kSps:
        if (nalu.sps_id < 0) {
          RTC_LOG(LS_WARNING) << "Received sps without sps id.";
        } else {
          new_sps.insert(nalu.sps_id);
        }
        break;
      default: {
        int needed_sps = -1;
        auto pps_it = new_pps.find(nalu.pps_id);
        if (pps_it != new_pps.end()) {
          needed_sps = pps_it->second;
        } else {
          auto pps_it2 = received_pps_.find(nalu.pps_id);
          if (pps_it2 == received_pps_.end()) {
            return false;
          }
          needed_sps = pps_it2->second;
        }
        if (new_sps.find(needed_sps) == new_sps.end() &&
            received_sps_.find(needed_sps) == received_sps_.end()) {
          return false;
        }
        break;
      }
    }
  }
  return true;
}
~~~

Whether NALUs belong to the same GOP is judged from sps\_id and pps\_id:

1. **pps\_id is pic\_parameter\_set\_id**, the id of the current PPS. A PPS in the bitstream is referenced by the slices that use it, and a slice references a PPS by storing that PPS's id in its slice header.
2. **sps\_id is seq\_parameter\_set\_id**, the id of the current SPS. It is referenced by the PPS, which carries the id of the SPS it refers to.

**So for NALUs within one GOP, the pps id in each slice should be the same, and the sps id in the PPS matches the id in the SPS. If two NALUs have consecutive seqnums, belong to the same GOP, and the SPS and PPS are present, the frames are considered continuously decodable.**

### How VCMJitterBuffer handles continuous decodability of NALUs

Now that the criteria H264 uses for continuity between NALUs are clear, let's go back to the **InsertPacket** method of VCMJitterBuffer and look at its continuity logic. Three member functions are involved: **FindAndInsertContinuousFramesWithState, FindAndInsertContinuousFrames, IsContinuous**

* The member function **FindAndInsertContinuousFramesWithState** uses the information about the most recently decodable NALU (recorded in a VCMDecodingState) to search the incomplete frame list for NALUs in the same GOP. Each match is removed from the incomplete frame list and inserted into the decodable frame list.

~~~
void VCMJitterBuffer::FindAndInsertContinuousFramesWithState(
    const VCMDecodingState& original_decoded_state) {  // Find NALUs in the same GOP.
  // Copy original_decoded_state so we can move the state forward with each
  // decodable frame we find.
  VCMDecodingState decoding_state;
  decoding_state.CopyFrom(original_decoded_state);
  // When temporal layers are available, we search for a complete or decodable
  // frame until we hit one of the following:
  // 1. Continuous base or sync layer.
  // 2. The end of the list was reached.
  // For H264 the temporal-layer handling can be ignored.
  for (FrameList::iterator it = incomplete_frames_.begin();
       it != incomplete_frames_.end();) {
    VCMFrameBuffer* frame = it->second;
    if (IsNewerTimestamp(original_decoded_state.time_stamp(),
                         frame->Timestamp())) {
      ++it;
      continue;
    }
    if (IsContinuousInState(*frame, decoding_state)) {
      decodable_frames_.InsertFrame(frame);
      incomplete_frames_.erase(it++);
      decoding_state.SetState(frame);
    } else if (frame->TemporalId() <= 0) {
      break;
    } else {
      ++it;
    }
  }
}
~~~

* The member function **FindAndInsertContinuousFrames** starts from a given NALU and searches the incomplete frame list for NALUs in the same GOP.

~~~
void VCMJitterBuffer::FindAndInsertContinuousFrames(
    const VCMFrameBuffer& new_frame) {
  VCMDecodingState decoding_state;
  decoding_state.CopyFrom(last_decoded_state_);
  decoding_state.SetState(&new_frame);
  FindAndInsertContinuousFramesWithState(decoding_state);
}
~~~

* The member function **IsContinuous** decides whether a NALU can be decoded continuously.

~~~
bool VCMJitterBuffer::IsContinuous(const VCMFrameBuffer& frame) const {
  if (IsContinuousInState(frame, last_decoded_state_)) {
    // Continuous with the previous NALU represented by last_decoded_state_.
    return true;
  }
  // The other case: the frame's seqnum is not consecutive with the NALU
  // represented by last_decoded_state_, but both belong to the same GOP,
  // so walk the decodable frame list to check.
  VCMDecodingState decoding_state;
  decoding_state.CopyFrom(last_decoded_state_);
  for (FrameList::const_iterator it = decodable_frames_.begin();
       it != decodable_frames_.end(); ++it) {
    VCMFrameBuffer* decodable_frame = it->second;
    if (IsNewerTimestamp(decodable_frame->Timestamp(), frame.Timestamp())) {
      break;
    }
    decoding_state.SetState(decodable_frame);
    if (IsContinuousInState(frame, decoding_state)) {
      return true;
    }
  }
  return false;
}
~~~

Deciding whether a NALU can be decoded continuously means considering two cases:

1. The NALU is in the same GOP as the previous NALU represented by last\_decoded\_state\_, and their seqnums are consecutive.
2. The NALU is in the same GOP but its seqnum is not consecutive. In that case, walk the decodable frame list looking for a NALU in the same GOP whose seqnum is consecutive with it.

An insert into VCMJitterBuffer therefore involves handling RTP packets as well as handling NALUs and GOPs, and these two articles have covered both fairly thoroughly. Upcoming posts will look at how NALUs are handled from here on.
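To make the H264 rule from this article concrete (consecutive seqnums plus resolvable SPS/PPS references), here is a minimal, self-contained sketch. It is not the WebRTC implementation: `SimpleNalu`, `SimpleFrame`, `SimpleDecodingState`, and `NaluKind` are hypothetical stand-ins invented for illustration, and the VP8/VP9 temporal-layer and picture-id paths are left out entirely. The one detail it spells out that the prose only implies is that the "previous high seqnum plus 1" comparison has to be done in 16-bit arithmetic, so that 65535 is followed by 0.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical, simplified stand-ins for the WebRTC types (assumptions).
enum class NaluKind { kSps, kPps, kSlice };

struct SimpleNalu {
  NaluKind kind;
  int sps_id;  // -1 when the NALU carries no SPS id
  int pps_id;  // -1 when the NALU carries no PPS id
};

struct SimpleFrame {
  bool is_key_frame;
  uint16_t low_seq_num;   // first RTP sequence number of the frame
  uint16_t high_seq_num;  // last RTP sequence number of the frame
  std::vector<SimpleNalu> nalus;
};

class SimpleDecodingState {
 public:
  // Remember the last seqnum and the parameter sets of a decoded frame.
  void SetState(const SimpleFrame& frame) {
    last_seq_num_ = frame.high_seq_num;
    for (const SimpleNalu& n : frame.nalus) {
      if (n.kind == NaluKind::kSps && n.sps_id >= 0)
        received_sps_.insert(n.sps_id);
      if (n.kind == NaluKind::kPps && n.pps_id >= 0 && n.sps_id >= 0)
        received_pps_[n.pps_id] = n.sps_id;
    }
    in_initial_state_ = false;
  }

  // True if low_seq_num directly follows the last decoded seqnum (mod 2^16).
  bool ContinuousSeqNum(uint16_t low_seq_num) const {
    return low_seq_num == static_cast<uint16_t>(last_seq_num_ + 1);
  }

  // True if every slice can resolve slice -> PPS -> SPS, either from this
  // frame's own parameter sets or from previously received ones.
  bool HaveSpsAndPps(const std::vector<SimpleNalu>& nalus) const {
    std::set<int> new_sps;
    std::map<int, int> new_pps;  // pps_id -> sps_id
    for (const SimpleNalu& nalu : nalus) {
      if (nalu.sps_id == -1 && nalu.pps_id == -1) continue;
      switch (nalu.kind) {
        case NaluKind::kSps:
          if (nalu.sps_id >= 0) new_sps.insert(nalu.sps_id);
          break;
        case NaluKind::kPps:
          if (nalu.pps_id >= 0 && nalu.sps_id >= 0)
            new_pps[nalu.pps_id] = nalu.sps_id;
          break;
        case NaluKind::kSlice: {
          int needed_sps = -1;
          auto it = new_pps.find(nalu.pps_id);
          if (it != new_pps.end()) {
            needed_sps = it->second;
          } else {
            auto it2 = received_pps_.find(nalu.pps_id);
            if (it2 == received_pps_.end()) return false;
            needed_sps = it2->second;
          }
          if (new_sps.count(needed_sps) == 0 &&
              received_sps_.count(needed_sps) == 0)
            return false;
          break;
        }
      }
    }
    return true;
  }

  // H264-only continuity check, mirroring the order described in the article.
  bool ContinuousFrame(const SimpleFrame& frame) const {
    if (frame.is_key_frame && HaveSpsAndPps(frame.nalus)) return true;
    if (in_initial_state_) return false;  // must start from a key frame
    return ContinuousSeqNum(frame.low_seq_num) && HaveSpsAndPps(frame.nalus);
  }

 private:
  uint16_t last_seq_num_ = 0;
  bool in_initial_state_ = true;
  std::set<int> received_sps_;
  std::map<int, int> received_pps_;
};
```

With this sketch, a key frame spanning seqnums 65534 to 65535 that carries SPS 0 and PPS 0 is accepted from the initial state. After `SetState` on that frame, a delta frame starting at seqnum 0 that references PPS 0 is continuous, while one starting at seqnum 2, or one referencing an unknown PPS id, is not.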