Evolutionary training and abstraction yields algorithmic generalization of neural computers

Tanneberg, Daniel ; Rueckert, Elmar ; Peters, Jan (2023)
Evolutionary training and abstraction yields algorithmic generalization of neural computers.
In: Nature Machine Intelligence, 2020, 2 (12)
doi: 10.26083/tuprints-00020535
Article, secondary publication, postprint

Abstract

A key feature of intelligent behaviour is the ability to learn abstract strategies that scale and transfer to unfamiliar problems. An abstract strategy solves every sample from a problem class, no matter its representation or complexity—similar to algorithms in computer science. Neural networks are powerful models for processing sensory data, discovering hidden patterns and learning complex functions, but they struggle to learn such iterative, sequential or hierarchical algorithmic strategies. Extending neural networks with external memories has increased their capacities to learn such strategies, but they are still prone to data variations, struggle to learn scalable and transferable solutions, and require massive training data. We present the neural Harvard computer, a memory-augmented network-based architecture that employs abstraction by decoupling algorithmic operations from data manipulations, realized by splitting the information flow and separated modules. This abstraction mechanism and evolutionary training enable the learning of robust and scalable algorithmic solutions. On a diverse set of 11 algorithms with varying complexities, we show that the neural Harvard computer reliably learns algorithmic solutions with strong generalization and abstraction, achieves perfect generalization and scaling to arbitrary task configurations and complexities far beyond seen during training, and independence of the data representation and the task domain.

Item type: Article
Published: 2023
Author(s): Tanneberg, Daniel ; Rueckert, Elmar ; Peters, Jan
Type of entry: Secondary publication
Title: Evolutionary training and abstraction yields algorithmic generalization of neural computers
Language: English
Publication date: 17 October 2023
Place of publication: Darmstadt
Date of first publication: 16 November 2020
Place of first publication: London
Publisher: Springer
Journal or series title: Nature Machine Intelligence
Volume: 2
Issue number: 12
Collation: 14, v pages
DOI: 10.26083/tuprints-00020535
URL / URN: https://tuprints.ulb.tu-darmstadt.de/20535
Origin: Secondary publication service
Status: Postprint
URN: urn:nbn:de:tuda-tuprints-205359
Dewey Decimal Classification (DDC): 000 Generalities, computer science, information science > 004 Computer science
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Intelligent Autonomous Systems
TU projects: EC/H2020|640554|SKILLS4ROBOTS
Date deposited: 17 Oct 2023 11:31
Last modified: 18 Oct 2023 08:07