Habib, Andrew (2021)
Learning to Find Bugs in Programs and their Documentation.
Technische Universität Darmstadt
doi: 10.26083/tuprints-00017377
Dissertation, first publication, publisher's version
Abstract
Although software is pervasive, almost all programs suffer from bugs and errors. To detect software bugs, developers use various techniques such as static analysis, dynamic analysis, and model checking. However, none of these techniques is bulletproof.
This dissertation argues that learning from programs and their documentation provides an effective means to prevent and detect software bugs. The main observation that motivates our work is that software documentation is often under-utilized by traditional bug detection techniques. Leveraging the documentation together with the program itself, whether its source code or runtime behavior, enables us to build unconventional bug detectors that benefit from the richness of natural language documentation and the formal algorithm of a program. More concretely, we present techniques that utilize the documentation of a program and the program itself to: (i) improve the documentation by inferring missing information and detecting inconsistencies, and (ii) find bugs in the source code or runtime behavior of the program. A key insight we build on is that machine learning provides a powerful means to deal with the fuzziness and nuances of natural language in software documentation and that source code is repetitive enough to also allow statistical learning from it. Therefore, several of the techniques proposed in this dissertation employ a learning component, whether it learns from documentation, source code, runtime behavior, or a combination of these.
We envision the impact of our work to be two-fold. First, we provide developers with novel bug detection techniques that complement traditional ones. Our approaches learn bug detectors end-to-end from data and hence do not require complex analysis frameworks. Second, we hope that our work will open the door for more research on automatically utilizing natural language in software development. Future work should explore more ideas on how to extract richer information from natural language to automate software engineering tasks, and how to utilize the programs themselves to enhance the state of the practice in software documentation.
Item type: | Dissertation | ||||
---|---|---|---|---|---|
Published: | 2021 | ||||
Author(s): | Habib, Andrew | ||||
Type of entry: | First publication | ||||
Title: | Learning to Find Bugs in Programs and their Documentation | ||||
Language: | English | ||||
Referees: | Mezini, Prof. Dr. Mira ; Pradel, Prof. Dr. Michael ; T. Devanbu, Prof. Dr. Premkumar | ||||
Year of publication: | 2021 | ||||
Place: | Darmstadt | ||||
Collation: | xiv, 211 pages | ||||
Date of oral examination: | 14 December 2020 | ||||
DOI: | 10.26083/tuprints-00017377 | ||||
URL / URN: | https://tuprints.ulb.tu-darmstadt.de/17377 | ||||
Status: | Publisher's version | ||||
URN: | urn:nbn:de:tuda-tuprints-173778 | ||||
Dewey Decimal Classification (DDC) subject group: | 000 General works, computer science, information science > 004 Computer science | ||||
Department(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > SOLA - Software Lab |
||||
Date deposited: | 08 Feb 2021 15:03 | ||||
Last modified: | 17 Feb 2021 09:00 | ||||