TU Darmstadt / ULB / TUbiblio

A machine-learning approach for classifying and categorizing android sources and sinks

Rasthofer, Siegfried and Arzt, Steven and Bodden, Eric (2014):
A machine-learning approach for classifying and categorizing android sources and sinks.
In: 2014 Network and Distributed System Security Symposium (NDSS), [Article]

Abstract

Today’s smartphone users face a security dilemma: many apps they install operate on privacy-sensitive data, although they might originate from developers whose trustworthiness is hard to judge. Researchers have addressed the problem with more and more sophisticated static and dynamic analysis tools as an aid to assess how apps use private user data. Those tools, however, rely on the manual configuration of lists of sources of sensitive data as well as sinks which might leak data to untrusted observers. Such lists are hard to come by.

We thus propose SUSI, a novel machine-learning guided approach for identifying sources and sinks directly from the code of any Android API. Given a training set of hand-annotated sources and sinks, SUSI identifies other sources and sinks in the entire API. To provide more fine-grained information, SUSI further categorizes the sources (e.g., unique identifier, location information, etc.) and sinks (e.g., network, file, etc.).

For Android 4.2, SUSI identifies hundreds of sources and sinks with over 92% accuracy, many of which are missed by current information-flow tracking tools. An evaluation of about 11,000 malware samples confirms that many of these sources and sinks are indeed used. We furthermore show that SUSI can reliably classify sources and sinks even in new, previously unseen Android versions and components like Google Glass or the Chromecast API.

Item Type: Article
Erschienen: 2014
Creators: Rasthofer, Siegfried and Arzt, Steven and Bodden, Eric
Title: A machine-learning approach for classifying and categorizing android sources and sinks
Language: English
Abstract:

Today’s smartphone users face a security dilemma: many apps they install operate on privacy-sensitive data, although they might originate from developers whose trustworthiness is hard to judge. Researchers have addressed the problem with more and more sophisticated static and dynamic analysis tools as an aid to assess how apps use private user data. Those tools, however, rely on the manual configuration of lists of sources of sensitive data as well as sinks which might leak data to untrusted observers. Such lists are hard to come by.

We thus propose SUSI, a novel machine-learning guided approach for identifying sources and sinks directly from the code of any Android API. Given a training set of hand-annotated sources and sinks, SUSI identifies other sources and sinks in the entire API. To provide more fine-grained information, SUSI further categorizes the sources (e.g., unique identifier, location information, etc.) and sinks (e.g., network, file, etc.).

For Android 4.2, SUSI identifies hundreds of sources and sinks with over 92% accuracy, many of which are missed by current information-flow tracking tools. An evaluation of about 11,000 malware samples confirms that many of these sources and sinks are indeed used. We furthermore show that SUSI can reliably classify sources and sinks even in new, previously unseen Android versions and components like Google Glass or the Chromecast API.

Journal or Publication Title: 2014 Network and Distributed System Security Symposium (NDSS)
Divisions: LOEWE > LOEWE-Zentren > CASED – Center for Advanced Security Research Darmstadt
20 Department of Computer Science > EC SPRIDE
20 Department of Computer Science > EC SPRIDE > Secure Software Engineering
Zentrale Einrichtungen
LOEWE
20 Department of Computer Science
LOEWE > LOEWE-Zentren
Date Deposited: 24 Nov 2014 14:19
Export:

Optionen (nur für Redakteure)

View Item View Item