This thesis considers resource management for the parallel decoding of multiple video streams on multi-core/many-core platforms. Such platforms have tens or hundreds of on-chip processing elements connected via a Network-on-Chip (NoC). Inefficient task allocation can increase communication cost and resource contention in the platform, leading to predictability and performance issues. Efficient resource management for large-scale complex workloads is considered a challenging research problem, especially when applications such as video streaming and decoding have dynamic and unpredictable workload characteristics. For these types of applications, runtime heuristic-based task mapping techniques are required. As application and platform sizes increase, decentralised resource management techniques become more desirable to overcome the reliability and performance bottlenecks of centralised management.
In this work, several heuristic-based runtime resource management techniques targeting real-time video decoding workloads are proposed. Firstly, two admission control approaches are proposed: one is fully deterministic and highly predictable; the other is heuristic-based and balances predictability and performance. Secondly, a pair of runtime task mapping schemes is presented, which make use of limited known application properties, communication cost and blocking-aware heuristics. Combined with the proposed deterministic admission controller, these techniques can provide strict timing guarantees for hard real-time streams whilst improving resource usage. The third contribution of this thesis is a distributed, bio-inspired, low-overhead task re-allocation technique, which is used to further improve the timeliness and workload distribution of admitted soft real-time streams.
Finally, this thesis explores parallelisation and resource management issues surrounding soft real-time video streams that have been encoded using complex encoding tools and modern codecs such as High Efficiency Video Coding (HEVC). Properties of real streams and decoding trace data are analysed in order to statistically model and generate synthetic HEVC video decoding workloads. These workloads are shown to have complex and varying task dependency structures and resource requirements. To address these challenges, two novel runtime task clustering and mapping techniques for Tile-parallel HEVC decoding are proposed. These strategies consider the workload communication-to-computation ratio and stream-specific characteristics to balance predictability improvement and communication energy reduction. Lastly, several task-to-memory-controller port assignment schemes are explored to alleviate performance bottlenecks resulting from memory traffic contention.