Rapid Prototyping and Exploration Environment for Generating C-to-Hardware-Compilers

Stock, Florian-Wolfgang (2019)
Rapid Prototyping and Exploration Environment for Generating C-to-Hardware-Compilers.
Technische Universität Darmstadt
Dissertation, first publication

URL / URN: https://tuprints.ulb.tu-darmstadt.de/8525

Abstract

Today there is an ever-increasing demand for more computational power, coupled with a desire to minimize energy requirements. Hardware accelerators currently appear to be the best solution to this problem. While general-purpose computation with GPUs seems to be very successful in this area, GPUs perform adequately only in those cases where the data access patterns and the algorithms used fit the underlying architecture. ASICs, on the other hand, can yield even better results in terms of performance and energy consumption, but are very inflexible, as they are manufactured with application-specific circuitry. Field Programmable Gate Arrays (FPGAs) combine both approaches: with their application-specific hardware they provide high computational power while requiring, for many applications, less energy than a CPU or a GPU. At the same time, their reconfigurability makes them far more flexible than an ASIC.

The remaining problem is programming the FPGAs, which is far more difficult than writing regular software. To allow ordinary software developers, who have at best very limited knowledge of hardware design, to make use of these devices, tools were developed that take a program in a regular high-level language and generate hardware from it.

Among such tools, C-to-HDL compilers are a particularly widespread approach. These compilers attempt to translate common C code into a hardware description language from which a datapath is generated. Most of these compilers place many restrictions on the input and differ in their generated microarchitecture, their scheduling method, their applied optimizations, their execution model, and even their target hardware. Thus, comparing a single aspect in isolation, such as the implemented scheduling method or the generated microarchitecture, is almost impossible, because the compilers differ in so many other aspects.

This work provides a survey of the existing C-to-HDL compilers and presents a new approach to evaluating and exploring different microarchitectures for dynamic scheduling used by such compilers. From a mathematically formulated rule set, the Triad compiler generates a backend for the Scale compiler framework, which then implements a hardware generation backend with the described dynamic scheduling.

While the generated hardware is more than a factor of four slower than hardware from highly optimized compilers, this environment allows easy comparison and exploration of different rule sets and of the microarchitecture of the dynamically scheduled datapaths generated from them. For demonstration purposes, a rule set modeling the COCOMA token flow model from the COMRADE 2.0 compiler was implemented. Multiple variants of it were explored, yielding savings of up to 11% of the required hardware resources.

Item type:

Dissertation

Published:

2019

Author(s):

Stock, Florian-Wolfgang

Type of entry:

First publication

Title:

Rapid Prototyping and Exploration Environment for Generating C-to-Hardware-Compilers

Language:

English

Referees:

Koch, Prof. Dr. Andreas ; Hochberger, Prof. Dr. Christian

Date of publication:

March 2019

Place:

Darmstadt

Date of oral examination:

19 March 2018

URL / URN:

https://tuprints.ulb.tu-darmstadt.de/8525

Alternative or translated abstract:

Alternative abstract

Language

Today there is an ever-growing demand for more computational power, combined with the desire to spend less and less energy on it. Hardware accelerators are currently the best solution to this. While GPUs are very successful in this field, they achieve their best performance only when the algorithms and memory access patterns are matched to the underlying architecture. ASICs, on the other hand, can deliver even more performance at even lower energy consumption, but are very inflexible because of their fixed functionality. FPGAs combine both approaches: they can provide high computational power at high energy efficiency, while their reconfigurability makes them more flexible than ASICs.

Programming FPGAs, however, remains an open problem, since they are much harder to program than conventional software. One possible solution is C-to-HDL compilers, which translate conventional C code into a hardware description language from which hardware is then generated. Many of these compilers restrict the supported subset of the language, and they differ in the optimizations applied, the scheduling, the generated microarchitecture, their execution model, or the target hardware. These many differences make a comparison with respect to a single aspect almost impossible.

This work offers a broad survey of the existing C-to-HDL compilers and presents a system that enables rapid evaluation of different approaches to dynamic scheduling. To this end, the compiler generator Triad reads a formal set of rules, from which a compiler backend for the Scale compiler framework is generated that can translate C into a hardware description language. The generated hardware uses dynamic scheduling as defined by the formal rule set.

While the generated hardware is more than four times slower than that of specialized optimizing compilers, the presented environment allows a wide variety of approaches to be tried out more quickly. For demonstration purposes, the scheduling of the COMRADE 2.0 compiler was replicated in the rule set. With only little effort, a variant was explored that required up to 11% fewer hardware resources in tests.

German (translated to English)

URN:

urn:nbn:de:tuda-tuprints-85250

Dewey Decimal Classification (DDC) subject group:

000 Generalities, computer science, information science > 004 Computer science

Division(s)/department(s):

20 Department of Computer Science
20 Department of Computer Science > Embedded Systems and Applications

Date deposited:

17 Mar 2019 20:55

Last modified:

17 Mar 2019 20:55
