A device for processing video data comprises: a memory; a receiver configured to receive real-time transport protocol (RTP) packets; and one or more processors configured to: receive a first fragmentation unit comprising a subset of a fragmented network abstraction layer (NAL) unit; parse a start bit of the first fragmentation unit to determine whether the first fragmentation unit comprises the start of the fragmented NAL unit; in response to the first fragmentation unit comprising the start of the fragmented NAL unit and one or both of (i) a transmission mode for the first fragmentation unit being a multi-session transmission mode and (ii) a first parameter being greater than a first value, parse a second parameter to determine a decoding order for the fragmented NAL unit; and decode the fragmented NAL unit based on the determined decoding order.
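
The conditional parsing described above may be illustrated by the following minimal sketch. It is not the claimed implementation. The byte layout is an assumption modelled on HEVC-style RTP fragmentation units: a two-byte payload header, a one-byte fragmentation unit header carrying a start bit, an end bit, and a 6-bit type, and an optional 16-bit field carrying the second parameter. The names `parse_fragmentation_unit`, `FragmentInfo`, `multi_session`, `first_param`, and `first_value` are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: a two-byte payload header precedes the one-byte FU header.
PAYLOAD_HDR_LEN = 2


@dataclass
class FragmentInfo:
    is_start: bool                  # start bit of the FU header
    is_end: bool                    # end bit of the FU header
    fu_type: int                    # type of the fragmented NAL unit
    decoding_order: Optional[int]   # second parameter; None when not signalled
    payload: bytes                  # fragment bytes following the headers


def parse_fragmentation_unit(
    packet_payload: bytes,
    multi_session: bool,
    first_param: int,
    first_value: int = 0,
) -> FragmentInfo:
    """Parse one fragmentation unit from an RTP payload (hypothetical layout)."""
    if len(packet_payload) < PAYLOAD_HDR_LEN + 1:
        raise ValueError("payload too short for a fragmentation unit")

    fu_header = packet_payload[PAYLOAD_HDR_LEN]
    is_start = bool(fu_header & 0x80)   # most significant bit: start of NAL unit
    is_end = bool(fu_header & 0x40)     # next bit: end of NAL unit
    fu_type = fu_header & 0x3F          # remaining six bits: NAL unit type
    offset = PAYLOAD_HDR_LEN + 1

    decoding_order = None
    # The second parameter is parsed only for the first fragment, and only when
    # the transmission mode is multi-session or the first parameter exceeds
    # the first value.
    if is_start and (multi_session or first_param > first_value):
        if len(packet_payload) < offset + 2:
            raise ValueError("payload too short for the decoding order field")
        decoding_order = int.from_bytes(packet_payload[offset:offset + 2], "big")
        offset += 2

    return FragmentInfo(is_start, is_end, fu_type, decoding_order,
                        packet_payload[offset:])
```

In this sketch, a receiver would call `parse_fragmentation_unit` on each fragment, record `decoding_order` from the fragment whose `is_start` is true, and apply that value to the reassembled NAL unit before passing it to the decoder. When neither condition holds, the two bytes after the fragmentation unit header are treated as ordinary fragment data.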