International Symposium on System-on-Chip

Global data storage and transfer exploration in (parallel) processor architectures for multi-media applications.

Francky Catthoor, IMEC

Many current applications are data-dominated. This happens in embedded multi-media applications like image, video, speech and audio processing but also in broadband wired or wireless telecom terminals and even modern network protocols. For these application domains, one of the most crucial problems in the design trajectory is related to data storage access and global data communication for complex data types like multi-dimensional arrays or sets of labeled data.

A comprehensive design exploration script has been developed in the past to deal with this for customized processor architectures. Models and automatable optimisation techniques have been investigated and developed to support the memory organisation tasks. They have a large impact on both (board) cost and power consumption. Practical examples have demonstrated potential savings of an order of magnitude rather than a few tens of percent. This relies especially on system-level transformations of the initial algorithmic specification. In the last 5 years, extensions towards data storage and transfer issues in (partly) predefined signal and video processor architectures have also been studied. The focus of this talk will be mainly on these issues, including global data-flow and control-flow (mainly loop) transformations, exploitation of task versus data parallelism, and aggressive cache-oriented code optimisations. Experiments on real-life multi-media applications show a speed-up of up to a factor of 3 and a typical reduction in memory bus load (on-chip and off-chip) of an order of magnitude. The result is also a significant reduction in energy consumption.
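As a rough illustration of the kind of loop transformation mentioned above, the sketch below fuses two loops so that an intermediate array is replaced by a scalar. It is a toy model, not the speaker's methodology: the function names, the pixel operation and the access counter are all assumptions made for this example, and the counter tracks only reads and writes of the intermediate buffer.

```python
def unfused(pixels):
    """Two separate loops communicating through a full intermediate array."""
    tmp_accesses = 0                 # counts accesses to the intermediate buffer only
    tmp = [0] * len(pixels)
    for i, p in enumerate(pixels):
        tmp[i] = p * 2               # write to intermediate storage
        tmp_accesses += 1
    out = [0] * len(pixels)
    for i in range(len(pixels)):
        out[i] = tmp[i] + 1          # read back from intermediate storage
        tmp_accesses += 1
    return out, tmp_accesses


def fused(pixels):
    """Same computation after loop fusion: the intermediate value stays scalar."""
    out = [0] * len(pixels)
    for i, p in enumerate(pixels):
        t = p * 2                    # lives in a scalar (a register in hardware terms)
        out[i] = t + 1
    return out, 0                    # the intermediate buffer no longer exists


if __name__ == "__main__":
    data = list(range(8))
    print(unfused(data))
    print(fused(data))
```

Both versions return the same output list; only the traffic to the intermediate buffer differs, which is the effect that storage-oriented transformations aim for.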


Reconfigurable Computing Architectures and Methodologies for System-on-Chip

Reiner W. Hartenstein, University of Kaiserslautern

Making gate arrays obsolete, FPGAs are successfully proceeding from niche to mainstream. Like microprocessor usage, FPGA programming is RAM-based, but by structural programming (also called "(re)configuration") instead of procedural programming. Now both host and accelerator are RAM-based and, as such, also available on the same chip: ready for SoC design.

Now accelerator definition may also be conveyed, at least partly, from vendor site to customer site, e.g. for upgrades. A new business model is needed. But this paradigm switch is still ignored: FPGAs do not repeat the RAM-based success story of the software industry. There is not yet a configware industry, since mapping applications onto FPGAs mainly uses hardware synthesis methods rather than real compilation. Supporting only fine-grained reconfigurability of roughly single-bit-wide configurable logic blocks (CLBs), the mapping tools are mainly based on gate-level methods, similar to CAD for hardwired logic.

From a decade of world-wide research on Reconfigurable Computing, another breed of reconfigurable platforms is emerging as a competitor to FPGAs. In contrast to FPGAs, the Reconfigurable Computing scene uses arrays of coarse-grained reconfigurable datapath units (rDPUs) with drastically reduced reconfigurability overhead, to directly configure high-level parallelism. But the classical machine paradigm does not support soft datapaths because "instruction fetch" is not done at run time.

To introduce the new business model and cope with the current accelerator design crisis, a transition is needed from CAD to compilation, and from hardware/software co-design to configware/software co-compilation. The paper illustrates such a roadmap to reconfigurable computing, supporting the emerging trend to platform-based SoC design.


Enabling Technologies for Model Year Upgrades for Mobile Phones

Mobile phones and base stations depend on microprocessors, DSPs and ASICs for their realization. Design objectives, however, are not just high performance, but include compact code, low power, and a rapid time to market at a low cost of hardware/software integration of the SoC.

The algorithms needed change frequently, with new features focussing on multimedia services and data networking in addition to voice transmission. DSP processors are changing every few months, and faster, cheaper, and better DSP chips are being produced by an increasingly large number of vendors.

However, code development for most DSP applications is still based on manual code optimization, a process that is slow and ineffective for complex applications. The code that results is processor-specific and represents a high-value but low-quality investment. With newer processors and architectures (including those that reconfigure), this problem is likely to get worse, with a negative impact on cost, quality, and time to market.

We propose the development of "model-year phones", utilizing a powerful methodology, SoftwareFirst. The underlying technologies include advanced retargetable compilers for DSPs, DSP code, or "DSPBean", migrators, and virtual prototyping for hardware/software codesign and verification.

Benefits of the proposed "model year phone methodology" include "near manual" DSP code quality with orders-of-magnitude time-to-market improvements, the ability to move code from one processor to another quickly, and the ability to partition the functionality across hardware and software in a manner that suits the technical and business objectives for the mobile phone. This attention to DSP software, and its careful design to ensure its effectiveness in spite of underlying platform changes, together with the capability to add new features economically and in a manner that reduces the number of errors in system integration, is expected to help mobile phone technology vendors retain margins and a competitive advantage in a climate that is favoring commoditization of the hardware side of the phone business.

The talk will rely on results from recent benchmarking efforts.


Low-Power DSP Components for Multimedia Communications

Scaled CMOS technologies and growing broadband applications in wired, wireless and fiber media have created many challenges as well as opportunities in communications systems implementations. Digital signal processing technology is an integral part of all communications systems. Reduction of power consumption is one of the major challenges in design of these systems or chips containing 100-500 M transistors.

This talk will address estimation and reduction of power consumption in DSP system implementations. We will begin by describing an approach to estimate power consumption which has been incorporated in the HEAT tool. Then low-power design of arithmetic building blocks (such as binary adders and Montgomery multipliers used in RSA cryptosystems), digital signal processing building blocks (such as digital filters, DCTs, Huffman coders), and error control coding building blocks (such as Turbo coders, low-density parity check codes and Reed-Solomon coders) will be addressed. Use of adaptive filter equalization in optical transmission will be described. Finally, low-power CAD approaches will be described in which power consumption is reduced by near-optimal scheduling of multiple supply voltages and multiple technology threshold voltages.
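To see why assigning different supply voltages to different blocks can pay off, the sketch below uses the standard first-order CMOS dynamic power model, P = a * C * Vdd^2 * f. This is a textbook approximation and is not claimed to be the estimation approach of the HEAT tool. The function names and all numeric values are hypothetical, and the model ignores leakage as well as the delay penalty of a lower supply voltage.

```python
def dynamic_power(c_eff, vdd, freq, activity=1.0):
    """First-order CMOS dynamic power in watts: activity * C * Vdd^2 * f."""
    return activity * c_eff * vdd * vdd * freq


def total_power(blocks):
    """Sum dynamic power over (c_eff, vdd, freq, activity) tuples."""
    return sum(dynamic_power(c, v, f, a) for (c, v, f, a) in blocks)


if __name__ == "__main__":
    # Hypothetical design: two blocks, both initially at the high supply voltage.
    single_vdd = [(1e-9, 1.8, 100e6, 0.2), (2e-9, 1.8, 100e6, 0.1)]
    # Same design with the second (assumed non-critical) block at a lower voltage.
    dual_vdd = [(1e-9, 1.8, 100e6, 0.2), (2e-9, 1.2, 100e6, 0.1)]
    print(total_power(single_vdd), total_power(dual_vdd))
```

Because power scales with the square of the supply voltage in this model, lowering the voltage of blocks that have timing slack reduces total power without touching the critical path.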


The Systolic Ring: A Dynamically Reconfigurable Architecture for SOC and Embedded Systems

The Internet is becoming one of the key features of tomorrow's communication world. The evolution of mobile phone networks, such as UMTS, will soon allow everyone to be connected everywhere. These new network technologies bring the ability to deal not only with classical voice or text messages, but also with richer content: multimedia. At the mobile level, this kind of data-oriented content requires highly efficient architectures, and today's mobile system-on-chip solutions will no longer be able to meet critical constraints such as area, power, and data computing efficiency. In this talk we will propose a new dynamically reconfigurable network (Systolic Ring), dedicated to data-oriented applications such as those targeted at third-generation networks. Principles, realisations and comparative results will be presented for some classical applications, targeted on different architectures.


Galois Field Instruction Set Accelerator in the StarCore SC140 DSP

The capacity of communication channels is improved by forward error correction schemes. In ADSL, the Reed-Solomon algorithm is chosen for error detection and correction. It belongs to the family of linear block codes and is based on arithmetic over Galois fields.

Implementing Galois arithmetic on the StarCore SC140 DSP using conventional arithmetic and look-up tables makes the Reed-Solomon algorithm the bottleneck of the ADSL application.

The solution is to perform Galois field operations on a dedicated hardware element. We chose an instruction set accelerator (ISA). The ISA plug-in module provides a means of enhancing the SC140 instruction set in accordance with the core program flow, using the core control system. This type of acceleration is mainly suitable for enhancing special arithmetic algorithms whose processing is merged with the core's flow. Thus the overhead of the program control of the module is eliminated without any penalty on its instruction set size. 24 bits of encoding space are dedicated to the ISA instruction set, so in practice there is no limit on the number of operations that can be executed in parallel with the core's instructions. In addition, there is no need for an interrupt-based communication protocol between the accelerator and the core, as is common in coprocessors.

Galois field algebra is an ideal candidate for this type of accelerator, being a special arithmetic that needs special operations. About 50 different instructions were defined in the GFISA, so that the cycle count could be reduced to its minimum.

The problematic parts of the algorithm on the DSP were identified as polynomial evaluation and MAC (multiply-accumulate) operations over Galois fields, and were thus implemented in a GFISA (Galois field ISA). The cycle count using the GFISA is reduced by a factor of 20 compared to the cycle count of the SC140. The idea was proven at a high level. The ISA is expected to consume only a little additional silicon area and power.
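For readers unfamiliar with the operations named above, here is a minimal software sketch of multiplication, multiply-accumulate and polynomial evaluation over GF(2^8). It is not the GFISA instruction set, whose details are not given in the abstract. The field polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D) is an assumption, chosen because it is widely used for Reed-Solomon codes, and all function names are hypothetical.

```python
PRIM_POLY = 0x11D  # assumed field polynomial: x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two GF(2^8) elements by shift-and-add with modular reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^m) is XOR
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY       # reduce back into 8 bits
        b >>= 1
    return result


def gf_mac(acc, a, b):
    """Galois field multiply-accumulate: acc + a * b."""
    return acc ^ gf_mul(a, b)


def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest-degree coefficient first) by Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = gf_mul(acc, x) ^ c
    return acc


if __name__ == "__main__":
    print(gf_mul(3, 7), gf_mac(5, 3, 7), poly_eval([1, 0, 1], 2))
```

The bit-serial loop in `gf_mul` shows why conventional DSP arithmetic handles this poorly: each product takes several shift, test and XOR steps, which is the kind of work a dedicated instruction could collapse into far fewer cycles.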


An Integrated Reconfigurable Architecture

Morpho Technologies is developing a novel model for reconfigurable computing systems, targeted at applications with inherent data-parallelism, high regularity, and high throughput requirements. In a wireless terminal these applications might include baseband processing, video compression (discrete cosine transforms, motion estimation), data encryption, and DSP transforms among others.

The approach is based on an advanced DSP architecture that couples a RISC processor with a reconfigurable DSP fabric. This architecture combines flexibility, performance, low power consumption and ease of programmability. The end result is a highly integrated, multi-mode, multimedia-enabled terminal design capability that can flexibly support evolving standards and data services.


Paul Master, QuickSilver Technology

To support the future Wireless Internet, a mobile terminal platform is required whose application and network capabilities can be upgraded on the fly to instantly meet the needs/desires of a changing consumer marketplace and the demand for capacity improvements from the operator.

Adaptive computing technology, as implemented in QuickSilver's Adaptive Computing Machine (ACM), is a key enabler of a flexible application and air interface environment. Embedded directly into a next-generation wireless/mobile device, the ACM adapts its architecture on the fly to create a new and specific hardware engine for each task, thus enabling a single mobile device to perform a variety of functions. By downloading software applications, which include hardware descriptions, from the Internet, a handheld device can upgrade/update its capability to add new features or upgrade the air interface.

The ACM is a pivotal technology for Software Defined Radio, enabling a consumer to roam seamlessly throughout the world, staying "connected" via the same single mobile device.

This presentation will describe the features and benefits of adaptive computing. It will also describe the architecture behind the ACM and the tools needed to program it.