New research project: Understanding and Reporting on IPTV Behaviour

In our previous project work we a) established an active measurement infrastructure to assess the communication characteristics of cross-provider IPTV paths with different kinds of access networks (cable, ADSL) and carried out a first series of experiments, b) realised an ns-2 simulation environment for IPTV systems (including VQEs) incorporating RTCP-based feedback, FEC, and retransmissions for SSM-based video distribution, and c) performed initial simulations of feedback-based error repair and inference using the characteristics determined by the measurements.

We have been able to gain some initial insights. Specifically, loss due to a single access link is random and uncorrelated (until the traffic hits the link capacity) at the timescales observed with moderate-rate IPTV probes, but is subject to rapid fluctuations due to cross traffic on the link, and to slower repetitive fluctuations per day and across the week due to background traffic further within the network. Statistical characteristics of end-to-end delay and jitter are predictable over long timescales, but vary rapidly on short timescales (partly due to cross traffic, partly due to link scheduling behaviour). Significant differences in behaviour are noted across providers and access link types. Finally, we find that in a system with limited scale per VQE, simple reactive repair mechanisms still appear to be sufficient.

Yet, expansions of both measurements and simulations are needed to get a more representative picture of path characteristics, to isolate those of the individual network segments, and to investigate a broader range of error repair and inference mechanisms based upon this. And we find that today's IPTV deployments do not use RTP/RTCP in many cases, rendering solely RTCP-based repair and analysis tools useless. This gives rise to three new research tasks.

Measurement, Modelling and Behaviour Inference. Our current work has developed a monitoring platform that can be deployed to residential users to measure wide-area IPTV performance. We will expand that monitoring infrastructure to collect more detailed IPTV performance measurements from users in a wider range of environments. That expansion will be in two directions: to extend the monitoring to cover a wider range of homes, using several different Internet service providers; and to enhance the monitoring tools to probe hop-by-hop behaviour of the path, not just the end-to-end behaviour. The resulting traces will be made available to the research community.

The expansion of our measurement infrastructure to a wider range of ISPs is straightforward, but essential to give confidence in the validity of our results, ensuring that we do not simply measure the quirks of a single ISP’s network. We will also extend the level of detail by monitoring the path hop-by-hop, using a traceroute-like idea, to distinguish last-hop behaviour from the backbone. This should not be difficult to realise for our measurements: since we are using synthetic traffic, and so are not constrained to produce viewable video at the receiver, we have the flexibility to probe internal network characteristics (in a commercial IPTV deployment, infrastructure nodes would be used to obtain this information).

Measurements such as these provide an important complement to measurements of deployed IPTV systems. We introduce heterogeneity of practice into the measurements by studying a range of ISPs, operating a range of networks, in somewhat different economic and geographic conditions (the UK and Finland); this helps reflect the range of environments in which IPTV will be deployed. We consider the full path, hop-by-hop, using synthetic traffic to allow us to gain insights into the causes of common behaviours; and we consider the inter-domain path, giving insight into the behaviour of future systems that may operate outside the walled garden of a single ISP.

We will build on the findings from the measurements to derive algorithms to infer the causes of system behaviour and establish the bigger picture of the losses. This may provide — in real-time — input to the error correction and recovery processes discussed below. Inference mechanisms will build upon distinguishing congestive and non-congestive losses by considering the delays and delay variations when losses occur, the distinction of different segments of the data paths, and the underlying loss model derived from the measurements (and the fluctuations over time). The network path will thus be modelled as the sum of at least three behaviours: 1) the core network (including the edge ISP, upstream of the DSLAM; typically slow changes, mostly due to time-of-day load variation); 2) edge traffic (other traffic sharing the last-hop ADSL link; mostly rapid changes due to TCP dynamics); and 3) background effects of noise on the last-hop ADSL link. (We will consider expanding the model as new effects emerge.) While any such model clearly abstracts away some details, we should be able to measure and isolate these effects, and estimate their contribution to the path characteristics depending on the time of day, etc. The ability to probe the last-hop link as discussed above will aid this estimation. The model output can be shared (dynamically) with infrastructure nodes close to the endpoint (e.g., for error repair), or even with the endpoints themselves (this may involve RTCP extensions to report model parameters). Smart nodes should be able to determine whether they are receiving the expected traffic for their environment, based on the model, and react accordingly. This endpoint or access link-specific reporting will later be expanded to operate across endpoints.
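As an illustration of the delay-based distinction between congestive and non-congestive losses described above, the following is a minimal sketch, not the project's algorithm. It assumes a per-packet trace of one-way delays with None marking lost packets; the function name classify_losses, the window of three packets, and the 20 ms queueing threshold are all hypothetical choices made for the example.

```python
from typing import Dict, List, Optional


def classify_losses(delays: List[Optional[float]],
                    window: int = 3,
                    queue_threshold_ms: float = 20.0) -> Dict[int, str]:
    """Label each lost packet by the delay seen just before the loss.

    delays[i] is the one-way delay (ms) of packet i, or None if it was lost.
    Assumption: a loss preceded by clearly elevated delay suggests a filling
    queue (congestive); a loss at near-baseline delay suggests link noise.
    """
    received = [d for d in delays if d is not None]
    if not received:
        return {}  # no delay information at all, nothing can be inferred

    # The minimum observed delay approximates the queue-free path delay.
    baseline = min(received)
    result: Dict[int, str] = {}
    for seq, delay in enumerate(delays):
        if delay is not None:
            continue
        # Delays of up to `window` packets received just before the loss.
        before = [d for d in delays[max(0, seq - window):seq] if d is not None]
        if not before:
            result[seq] = "unknown"
        elif sum(before) / len(before) > baseline + queue_threshold_ms:
            result[seq] = "congestive"
        else:
            result[seq] = "non-congestive"
    return result


trace = [10, 11, 10, 12, 40, 45, 50, None, 12, 10, 11, None, 10]
labels = classify_losses(trace)
```

In this trace the loss at index 7 follows a delay build-up and is labelled congestive, while the loss at index 11 occurs at baseline delay and is labelled non-congestive. A realistic classifier would also need the delay variation, the path segment, and the time-of-day model mentioned above.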

Error Correction and Recovery. We will continue investigations of error repair, adapting the repair function based on measured network conditions. In addition to using the “instant” loss metrics of an endpoint or a group of endpoints to choose appropriate repair mechanisms, we will apply the inference results to improve system performance, giving the repair algorithms an understanding of the expected system behaviour, and some hints on why the current behaviour deviates from that expectation. We will investigate repair algorithms across the range of path characteristics obtained from the measurements, across heterogeneous access networks, across a broad scale with respect to the number of receiver systems, and across channels. An important part of this will involve modelling the use of a common repair stream for multiple channels, to understand the issues arising (both in terms of signalling protocols, and integration across RTP/RTCP sessions) when repair techniques (i.e., network coding) are applied across multiple channels. We will investigate the applicability of such repair channels with different numbers of channels and receiver distributions and consider different parts of the topology (e.g., contribution vs. core vs. access network) and, specifically for the latter, the implications of different access technologies (e.g., the broadcast-style cable vs. the point-to-point DSL networks). We expect that mid-term observation of the network at large can help to fine-tune proactive cross-stream repair strategies to minimise redundancy, reporting, and repair overhead.
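To make the idea of a common repair stream for multiple channels concrete, here is a minimal sketch of the simplest form of coding across channels: a single XOR parity packet computed over one packet from each channel. It is an illustration under stated assumptions, not the scheme the project will adopt. The names make_repair and recover are hypothetical, and the sketch assumes that the repairing node (a receiver or an intermediary) holds the corresponding packets of all other channels.

```python
from typing import Dict, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, zero-padding the shorter one."""
    n = max(len(a), len(b))
    return bytes(x ^ y for x, y in zip(a.ljust(n, b"\x00"), b.ljust(n, b"\x00")))


def make_repair(packets: Dict[str, bytes]) -> Tuple[bytes, int]:
    """Build one repair packet covering one packet from each channel.

    Returns the XOR of the zero-padded payloads and the XOR of their lengths;
    the latter lets a receiver strip the padding after recovery.
    """
    payload, length = b"", 0
    for p in packets.values():
        payload = xor_bytes(payload, p)
        length ^= len(p)
    return payload, length


def recover(repair: Tuple[bytes, int], received: Dict[str, bytes]) -> bytes:
    """Recover the single missing packet from the repair packet and the rest."""
    payload, length = repair
    for p in received.values():
        payload = xor_bytes(payload, p)
        length ^= len(p)
    return payload[:length]


channels = {"ch1": b"news-frame-17", "ch2": b"sport-frame-4", "ch3": b"film-42"}
repair = make_repair(channels)
others = {name: p for name, p in channels.items() if name != "ch3"}
restored = recover(repair, others)
```

One parity packet repairs at most one loss per covered group, so the number of channels per group trades repair overhead against robustness; this is one of the parameters such an investigation would vary.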

Reporting Frameworks and Virtual RTCP. The above mechanisms will be suitable for media streams offering RTCP-based feedback from the endpoints and possibly intermediary devices, taking advantage of the monitoring and reporting capabilities of RTP. To provide similar services for non-RTP streams as well, we will design a proxy RTCP. With proxy RTCP, intermediaries or endpoints analyse non-RTP media streams to a) detect real-time flows, and b) assess them and generate RTCP reports on their behalf, so that the monitoring, inference, and repair infrastructure can continue to operate as it does with RTP streams. Proxy RTCP will be realised either in the endpoints or inside the network to deal with legacy boxes.

Proxy RTCP may or may not be able to use knowledge of the payload carried in, e.g., plain UDP packets. A set of header formats (e.g., MPEG) and traffic pattern “signatures” may be provided as input, but the system should ideally be more general. Without any cues from known RTP headers, stream identification will use size and timing characteristics of individual flows, taking advantage of repetitive patterns over time; the latter will also contribute to loss detection. Packet identification (e.g., for repair purposes and cross-receiver correlation) may revert to hashes or similar in the absence of any other identifying features. Means for flow identification, which needs to happen in real-time, may include Fourier or wavelet transforms as well as simpler approximations, e.g., using heuristics. One interesting research aspect is the trade-off between accuracy and performance when performing a simultaneous analysis for many streams on a core network link.
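The Fourier-based flow identification mentioned above can be sketched as follows. This is a deliberately naive illustration, assuming that packet arrival timestamps for one flow are available in milliseconds; the function name dominant_period_ms, the 5 ms bin width, and the 0.9 tolerance used to prefer the fundamental over its harmonics are hypothetical choices. A real-time implementation on a core link would use an FFT or one of the cheaper heuristics instead of this quadratic-time transform.

```python
import cmath
from typing import List, Optional


def dominant_period_ms(arrivals_ms: List[float],
                       bin_ms: float = 5.0,
                       duration_ms: Optional[float] = None,
                       tolerance: float = 0.9) -> Optional[float]:
    """Estimate the dominant packet period of a flow from arrival times.

    Arrivals are binned into a packet-count series, the mean is removed,
    and a naive discrete Fourier transform is evaluated. The lowest
    frequency whose magnitude is within `tolerance` of the peak is taken
    as the fundamental, so that harmonics of a periodic flow are skipped.
    """
    if not arrivals_ms:
        return None
    start = min(arrivals_ms)
    if duration_ms is None:
        duration_ms = max(arrivals_ms) - start + bin_ms
    n = int(duration_ms // bin_ms)
    if n < 4:
        return None  # too short a trace to say anything

    counts = [0.0] * n
    for t in arrivals_ms:
        idx = int((t - start) // bin_ms)
        if idx < n:
            counts[idx] += 1.0
    mean = sum(counts) / n
    centred = [c - mean for c in counts]

    # Magnitude of each positive-frequency DFT coefficient (k = 1 .. n/2).
    magnitudes = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(centred))
        magnitudes.append((k, abs(coeff)))

    peak = max(m for _, m in magnitudes)
    if peak < 1e-9:
        return None  # flat series, no periodic component
    for k, m in magnitudes:
        if m >= tolerance * peak:
            return n * bin_ms / k  # period corresponding to frequency bin k
    return None


# Synthetic flow: one packet every 20 ms for 2 seconds.
arrivals = [i * 20.0 for i in range(100)]
period = dominant_period_ms(arrivals, bin_ms=5.0, duration_ms=2000.0)
```

For this synthetic flow the estimate is a period of 20 ms. A strongly periodic spectrum of this kind is one possible cue that a plain UDP flow carries real-time media, and gaps in the expected pattern can then hint at losses.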

This project is a collaboration between the Helsinki University of Technology and the University of Glasgow. Funding is provided by Cisco Systems.