
How to Implement Audio/Video Synchronization (live555)

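In live555, video and audio are encoded and streamed separately, so how are the two kept in sync?
If the timestamps of both the video and the audio stream can be tied to the same NTP time, the two streams can be synchronized.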

Network Time Protocol (NTP) is a networking protocol for clock synchronization between computer systems over packet-switched, variable-latency data networks.

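How does live555 implement this mechanism?
The overall idea is:
map the RTP timestamps of the audio and video streams onto the absolute time carried in RTCP (the NTP timestamp), so that both streams share one timeline.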

First, look at the timestamp code as it behaves before any synchronization is applied:

void RTPReceptionStats::noteIncomingPacket(u_int16_t seqNum,
                                           u_int32_t rtpTimestamp,
                                           unsigned timestampFrequency,
                                           Boolean useForJitterCalculation,
                                           struct timeval& resultPresentationTime,
                                           Boolean& resultHasBeenSyncedUsingRTCP,
                                           unsigned packetSize) {
  ...
  // Record the inter-packet delay
  struct timeval timeNow;
  gettimeofday(&timeNow, NULL);
  ...
  // Return the 'presentation time' that corresponds to "rtpTimestamp":
  if (fSyncTime.tv_sec == 0 && fSyncTime.tv_usec == 0) {
    // This is the first timestamp that we've seen, so use the current
    // 'wall clock' time as the synchronization time.  (This will be
    // corrected later when we receive RTCP SRs.)
    fSyncTimestamp = rtpTimestamp; // the first RTP timestamp
    fSyncTime = timeNow;           // use the current system time as the initial reference time
  }

  int timestampDiff = rtpTimestamp - fSyncTimestamp;
      // Note: This works even if the timestamp wraps around
      // (as long as "int" is 32 bits)

  // Divide this by the timestamp frequency to get real time:
  double timeDiff = timestampDiff/(double)timestampFrequency;

  // Add this to the 'sync time' to get our result:
  unsigned const million = 1000000;
  unsigned seconds, uSeconds;
  if (timeDiff >= 0.0) { // compute the presentation time
    seconds = fSyncTime.tv_sec + (unsigned)(timeDiff);
    uSeconds = fSyncTime.tv_usec
      + (unsigned)((timeDiff - (unsigned)timeDiff)*million);
    if (uSeconds >= million) {
      uSeconds -= million;
      ++seconds;
    }
  } else {
    timeDiff = -timeDiff;
    seconds = fSyncTime.tv_sec - (unsigned)(timeDiff);
    uSeconds = fSyncTime.tv_usec
      - (unsigned)((timeDiff - (unsigned)timeDiff)*million);
    if ((int)uSeconds < 0) {
      uSeconds += million;
      --seconds;
    }
  }
  resultPresentationTime.tv_sec = seconds;
  resultPresentationTime.tv_usec = uSeconds;
  resultHasBeenSyncedUsingRTCP = fHasBeenSynchronized;

  // Save these as the new synchronization timestamp & time:
  fSyncTimestamp = rtpTimestamp;
  fSyncTime = resultPresentationTime;

  fPreviousPacketRTPTimestamp = rtpTimestamp;
}

Two member variables matter here: fSyncTimestamp and fSyncTime.

class RTPReceptionStats {
...

private:
  // Used to convert from RTP timestamp to 'wall clock' time:
  Boolean fHasBeenSynchronized;
  u_int32_t fSyncTimestamp;
  struct timeval fSyncTime;
};
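  • fSyncTimestamp
    An RTP timestamp. By default, the rtpTimestamp of frame (packet) N becomes the fSyncTimestamp used for frame N+1.
  • fSyncTime
    A 'wall clock' time. By default, the 'wall clock' time computed for frame (packet) N becomes the fSyncTime used for frame N+1.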

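In essence, RTPReceptionStats::noteIncomingPacket does one thing:
it converts an RTP timestamp into a 'wall clock' time.

When the first RTP packet arrives, the current system time is taken as the first 'wall clock' time.
After that, whenever the RTP timestamp changes, the change is converted into real time: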

int timestampDiff = rtpTimestamp - fSyncTimestamp;
 // Divide this by the timestamp frequency to get real time: 
double timeDiff = timestampDiff/(double)timestampFrequency;

That difference is then applied to the 'wall clock' time, for example:

seconds = fSyncTime.tv_sec + (unsigned)(timeDiff); 
uSeconds = fSyncTime.tv_usec + (unsigned)((timeDiff - (unsigned)timeDiff)*million);

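As this shows, the logic above depends entirely on the accuracy of the receiver's system time (the moment the first packet happened to arrive), and it contains no correction mechanism at all.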
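The conversion described above can be sketched as a small standalone function. This is an illustrative sketch, not live555 code: the names WallClock and rtpToWallClock are hypothetical, and it uses integer microseconds instead of the double arithmetic in the original. It assumes the usual two's-complement conversion from unsigned to signed 32-bit values, which is what makes the wraparound case work.

```cpp
#include <cstdint>

// Hypothetical helper type (not part of live555): seconds + microseconds.
struct WallClock {
    int64_t sec;
    int64_t usec;
};

// Convert "rtpTimestamp" to wall-clock time, given a reference pair
// (syncTimestamp, syncTime) and the RTP clock rate in Hz.
WallClock rtpToWallClock(uint32_t rtpTimestamp, uint32_t syncTimestamp,
                         WallClock syncTime, unsigned timestampFrequency) {
    // Unsigned subtraction wraps modulo 2^32. Reinterpreting the result as a
    // signed 32-bit value gives a small positive or negative delta even when
    // the RTP timestamp has wrapped around (assumes two's complement).
    int32_t timestampDiff = static_cast<int32_t>(rtpTimestamp - syncTimestamp);

    // Scale ticks to microseconds in 64-bit integer arithmetic.
    int64_t diffUsec = static_cast<int64_t>(timestampDiff) * 1000000 /
                       static_cast<int64_t>(timestampFrequency);

    // Add the delta to the reference time and normalize usec into [0, 10^6).
    int64_t totalUsec = syncTime.sec * 1000000 + syncTime.usec + diffUsec;
    WallClock result;
    result.sec = totalUsec / 1000000;
    result.usec = totalUsec % 1000000;
    if (result.usec < 0) {
        result.usec += 1000000;
        --result.sec;
    }
    return result;
}
```

For example, with a 90 kHz video clock, a packet whose RTP timestamp is 45000 ticks past the reference lands exactly 0.5 s after the reference time. Nothing in this calculation knows how the reference time relates to the sender's clock, which is why a correction step is needed.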

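So where does live555 perform the time correction?
The answer is the Sender Report (SR) that the RTSP server sends back over RTCP: the NTP timestamp and RTP timestamp carried in the SR are used to correct fSyncTimestamp and fSyncTime.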


Part of Sender Report RTCP Packet
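For reference, the fields used for the correction sit at fixed offsets in an RTCP Sender Report as defined in RFC 3550: a 4-byte header (version, padding, report count, packet type 200, length), the sender's SSRC, the 64-bit NTP timestamp, and the RTP timestamp, all in network byte order. The sketch below extracts them; SenderReportInfo, readBE32 and parseSenderReport are hypothetical names for illustration, not the live555 parser.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical container for the SR fields relevant to synchronization.
struct SenderReportInfo {
    uint32_t ssrc;
    uint32_t ntpTimestampMSW;  // seconds since 1900-01-01
    uint32_t ntpTimestampLSW;  // fraction of a second, in units of 2^-32 s
    uint32_t rtpTimestamp;     // same instant, in RTP clock ticks
};

// Read a 32-bit big-endian (network byte order) value.
static uint32_t readBE32(const uint8_t* p) {
    return (static_cast<uint32_t>(p[0]) << 24) |
           (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) |
           static_cast<uint32_t>(p[3]);
}

// Returns true and fills "out" if "data" begins with an RTCP Sender Report.
bool parseSenderReport(const uint8_t* data, size_t size, SenderReportInfo& out) {
    const size_t kMinSRSize = 28;  // 4 (header) + 4 (SSRC) + 20 (sender info)
    const uint8_t kRtcpPacketTypeSR = 200;
    if (data == nullptr || size < kMinSRSize) return false;
    if ((data[0] >> 6) != 2) return false;          // RTP/RTCP version must be 2
    if (data[1] != kRtcpPacketTypeSR) return false; // not a Sender Report

    out.ssrc            = readBE32(data + 4);
    out.ntpTimestampMSW = readBE32(data + 8);
    out.ntpTimestampLSW = readBE32(data + 12);
    out.rtpTimestamp    = readBE32(data + 16);
    return true;
}
```

The three values after the SSRC are exactly the arguments that noteIncomingSR receives.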

The correction code is as follows:

void RTPReceptionStats::noteIncomingSR(u_int32_t ntpTimestampMSW,
                                       u_int32_t ntpTimestampLSW,
                                       u_int32_t rtpTimestamp) 
{
    fLastReceivedSR_NTPmsw = ntpTimestampMSW;
    fLastReceivedSR_NTPlsw = ntpTimestampLSW;

    gettimeofday(&fLastReceivedSR_time, NULL);

    // Use this SR to update time synchronization information:
    // ntpTimestampMSW : NTP timestamp, most significant word (upper 32 bits of the 64-bit NTP timestamp)
    fSyncTimestamp      = rtpTimestamp;
    fSyncTime.tv_sec    = ntpTimestampMSW - 0x83AA7E80; // 1/1/1900 -> 1/1/1970

    // ntpTimestampLSW  : NTP timestamp, least significant word (lower 32 bits of the 64-bit NTP timestamp)
    double microseconds = (ntpTimestampLSW * 15625.0) / 0x04000000; // 10^6/2^32
    fSyncTime.tv_usec   = (unsigned)(microseconds + 0.5);
}
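The arithmetic in that correction can be checked in isolation. The constant 0x83AA7E80 is 2,208,988,800, the number of seconds between the NTP epoch (1 January 1900) and the Unix epoch (1 January 1970), and the low word counts units of 2^-32 s, so multiplying by 10^6/2^32 (equivalently 15625/2^26) yields microseconds. Below is a minimal sketch with hypothetical names (UnixTime, ntpToUnixTime) that uses integer arithmetic. Unlike the snippet above, it also carries into the seconds field when rounding produces exactly 1,000,000 microseconds, which can happen for a low word very close to 2^32.

```cpp
#include <cstdint>

// Hypothetical result type: seconds and microseconds since 1970-01-01.
struct UnixTime {
    uint32_t sec;
    uint32_t usec;
};

UnixTime ntpToUnixTime(uint32_t ntpMSW, uint32_t ntpLSW) {
    // Seconds between 1900-01-01 (NTP epoch) and 1970-01-01 (Unix epoch).
    const uint32_t kNtpToUnixOffset = 0x83AA7E80u;  // 2208988800

    UnixTime t;
    t.sec = ntpMSW - kNtpToUnixOffset;

    // usec = round(ntpLSW * 10^6 / 2^32), computed in 64-bit integers;
    // adding 2^31 before the shift rounds to the nearest microsecond.
    uint64_t scaled = static_cast<uint64_t>(ntpLSW) * 1000000u + (1ull << 31);
    t.usec = static_cast<uint32_t>(scaled >> 32);

    // Rounding can produce exactly one full second; carry it over.
    if (t.usec >= 1000000u) {
        t.usec -= 1000000u;
        ++t.sec;
    }
    return t;
}
```

A low word of 0x80000000 is half a second, so it maps to 500,000 microseconds, and 0x40000000 maps to 250,000.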

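Because each Sender Report promptly corrects the timeline of its own stream, video and audio separately, both streams end up expressed on the sender's NTP clock, and that is what keeps audio and video in sync.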
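To make the last step concrete, here is a minimal sketch of how a receiver could compare the two streams once each has a reference pair taken from its own Sender Report. The names StreamSync, presentationTimeUsec and avOffsetUsec are hypothetical, and the clock rates in the example (90 kHz video, 8 kHz audio) are only assumptions for illustration; a real player would feed the resulting offset into its own rendering or buffering logic.

```cpp
#include <cstdint>

// Hypothetical per-stream state, refreshed on every Sender Report.
struct StreamSync {
    uint32_t syncTimestamp;  // RTP timestamp from the most recent SR
    int64_t  syncTimeUsec;   // matching NTP time, in microseconds
    unsigned clockRate;      // RTP clock rate in Hz
};

// Map an RTP timestamp of this stream onto the shared NTP-based timeline.
int64_t presentationTimeUsec(const StreamSync& s, uint32_t rtpTimestamp) {
    // Signed 32-bit delta, so timestamp wraparound is handled
    // (assumes two's-complement conversion).
    int32_t diff = static_cast<int32_t>(rtpTimestamp - s.syncTimestamp);
    return s.syncTimeUsec +
           static_cast<int64_t>(diff) * 1000000 / static_cast<int64_t>(s.clockRate);
}

// Positive result: the video sample is later than the audio sample.
// Zero: both samples belong to the same instant on the sender's clock.
int64_t avOffsetUsec(const StreamSync& video, uint32_t videoTimestamp,
                     const StreamSync& audio, uint32_t audioTimestamp) {
    return presentationTimeUsec(video, videoTimestamp) -
           presentationTimeUsec(audio, audioTimestamp);
}
```

Even though the two streams use unrelated RTP timestamp bases and different clock rates, their presentation times become directly comparable once both are anchored to NTP time.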