Schramowski, Patrick (2023)
Self-Supervised Learning of Machine Ethics.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00023090
Dissertation, first publication, publisher's version
Abstract
In recent years, Artificial Intelligence (AI), and deep learning in particular, has proven to be a technology driver in industry. However, while these systems advance existing technologies, create novel ones, automate processes, and assist humans in essential areas such as drug discovery, they also raise many concerns, as groundbreaking technologies have before. These concerns include, for instance, models producing stereotypical and derogatory content as well as gender and racial biases. Since AI technologies will permeate more of our lives in the coming years, these concerns need to be addressed. This thesis examines recent data-driven approaches, which often exhibit degenerate and biased behavior as a result of self-supervised training on large-scale, noisy web data that contains potentially inappropriate content. While this problem is well established, we investigate and demonstrate the promise of the knowledge and capabilities that deep models acquire precisely through exposure to this potentially inappropriate data. Importantly, we present the first approaches for learning ethics from data. Our findings suggest that an AI system that learns an improved representation of data, and that is able to better understand and produce it, will in the process also acquire more accurate societal knowledge, in this case historical and cultural associations, enabling it to make human-like "right" and "wrong" choices. Furthermore, based on these findings, we ask the arguably "circular" question of whether a machine can help us mitigate the very concerns it raises. Importantly, we demonstrate the value of models' ability to distinguish between "right" and "wrong" and show how exploiting this ability can mitigate risks surrounding large-scale models themselves.
However, we also highlight the role of human-machine interaction in exploring and reinforcing AI systems' properties, including their flaws and merits, and present how human feedback on explanations can align deep-learning-based models with our precepts. We present these algorithms and corresponding findings, providing important insights toward the goal of instilling human values in AI systems, a goal which, in summary, may not be insurmountable in the long run.
| Item type: | Dissertation |
|---|---|
| Published: | 2023 |
| Author(s): | Schramowski, Patrick |
| Type of entry: | First publication |
| Title: | Self-Supervised Learning of Machine Ethics |
| Language: | English |
| Referees: | Kersting, Prof. Dr. Kristian ; Fraser, Prof. Dr. Alexander M. |
| Year of publication: | 2023 |
| Place: | Darmstadt |
| Collation: | xxi, 208 pages |
| Date of oral examination: | 20 March 2023 |
| DOI: | 10.26083/tuprints-00023090 |
| URL / URN: | https://tuprints.ulb.tu-darmstadt.de/23090 |
| Status: | Publisher's version |
| URN: | urn:nbn:de:tuda-tuprints-230900 |
| Dewey Decimal Classification (DDC): | 000 Generalities, computer science, information science > 004 Computer science |
| Department(s): | 20 Department of Computer Science > Artificial Intelligence and Machine Learning |
| Date deposited: | 24 May 2023 12:11 |
| Last modified: | 06 Jun 2023 09:09 |