Malware Detection in Mobile Apps

A custom 2-layer neural network, built from scratch without ML frameworks, that detects Android malware from app permission requests with 97.16% accuracy.

Python · Scikit-learn · Pandas · NumPy · Matplotlib

The Problem

Android malware often requests excessive permissions: a flashlight app asking for SMS access is a red flag. I wanted to find out whether permission patterns alone could reliably identify malicious apps, and I built the neural network from scratch to understand the math behind classification.

The Solution

I engineered a custom 2-layer neural network without any ML frameworks, implementing forward propagation, backpropagation, gradient descent, and activation functions manually in NumPy.
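To illustrate the pieces named above, here is a minimal NumPy sketch of a 2-layer network with a manual forward pass, backward pass, and gradient descent step. It is not the project's actual code: the ReLU hidden activation, sigmoid output, binary cross-entropy loss, hidden size, learning rate, and all function names are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, rng):
    # Scaled Gaussian weights and zero biases (the scaling is an assumption).
    return {
        "W1": rng.standard_normal((n_in, n_hidden)) * np.sqrt(2.0 / n_in),
        "b1": np.zeros(n_hidden),
        "W2": rng.standard_normal((n_hidden, 1)) * np.sqrt(1.0 / n_hidden),
        "b2": np.zeros(1),
    }

def forward(X, p):
    Z1 = X @ p["W1"] + p["b1"]            # hidden pre-activations
    A1 = np.maximum(0.0, Z1)              # ReLU (assumed activation)
    A2 = sigmoid(A1 @ p["W2"] + p["b2"])  # predicted probability of malware
    return Z1, A1, A2

def backward(X, y, p, Z1, A1, A2):
    m = X.shape[0]
    # For sigmoid + mean binary cross-entropy, dLoss/dZ2 = (A2 - y) / m.
    dZ2 = (A2 - y) / m
    dZ1 = (dZ2 @ p["W2"].T) * (Z1 > 0)    # chain rule through ReLU
    return {
        "W2": A1.T @ dZ2,
        "b2": dZ2.sum(axis=0),
        "W1": X.T @ dZ1,
        "b1": dZ1.sum(axis=0),
    }

def train(X, y, n_hidden=16, lr=0.1, epochs=300, seed=0):
    # X: (m, n_features) array of 0/1 values; y: (m, 1) array of 0/1 labels.
    rng = np.random.default_rng(seed)
    p = init_params(X.shape[1], n_hidden, rng)
    losses = []
    eps = 1e-12  # avoids log(0)
    for _ in range(epochs):
        Z1, A1, A2 = forward(X, p)
        loss = -np.mean(y * np.log(A2 + eps) + (1 - y) * np.log(1 - A2 + eps))
        losses.append(float(loss))
        grads = backward(X, y, p, Z1, A1, A2)
        for key in p:
            p[key] -= lr * grads[key]     # plain full-batch gradient descent
    return p, losses
```

Calling `train(X, y)` returns the learned parameters and the per-epoch loss, which can be plotted to check that training is converging.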

Feature engineering — Encoded Android permission requests as binary feature vectors (1 = requested, 0 = not).
Manual implementation — Wrote the full training loop including weight initialization, loss computation, and gradient updates from first principles.
Evaluation — Achieved 97.16% accuracy and 98.20% F1 score on a labeled dataset of benign and malicious apps.
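The encoding and evaluation steps above can be sketched as follows. The permission vocabulary and the helper names `encode` and `accuracy_and_f1` are hypothetical, and the tiny label arrays in the usage example are made up for illustration only; they are not the project's data.

```python
import numpy as np

# Hypothetical permission vocabulary; a real dataset would list many more.
PERMISSIONS = ["INTERNET", "SEND_SMS", "READ_CONTACTS", "CAMERA", "FLASHLIGHT"]
INDEX = {name: i for i, name in enumerate(PERMISSIONS)}

def encode(requested):
    """Return a binary vector: 1 = permission requested, 0 = not requested."""
    vec = np.zeros(len(PERMISSIONS), dtype=np.float64)
    for name in requested:
        if name in INDEX:  # permissions outside the vocabulary are ignored
            vec[INDEX[name]] = 1.0
    return vec

def accuracy_and_f1(y_true, y_pred):
    """Compute accuracy and F1 for binary labels (1 = malicious)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)    # malicious, flagged
    fp = np.sum(~y_true & y_pred)   # benign, flagged
    fn = np.sum(y_true & ~y_pred)   # malicious, missed
    accuracy = np.mean(y_true == y_pred)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return float(accuracy), float(f1)

# Usage with made-up values:
features = encode(["SEND_SMS", "CAMERA"])
acc, f1 = accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

F1 is reported alongside accuracy because it accounts for both missed malware and false alarms, which accuracy alone can hide when the classes are imbalanced.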

What Went Wrong

The weights were initially drawn from a uniform random distribution, which caused vanishing gradients in the hidden layer during training: the loss plateaued early and accuracy stalled at ~85%.

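The fix: I switched to He initialization for the hidden layer weights, drawing them from a zero-mean normal distribution with standard deviation sqrt(2/n), where n is the number of inputs to the layer. This kept gradient magnitudes stable through the network and allowed training to converge to 97%+ accuracy.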
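One way to see why the scaling matters is to compare hidden-layer pre-activations under the two schemes. This is a sketch under stated assumptions: the uniform range [0, 1), the layer sizes, and the random binary inputs are all illustrative choices, not details taken from the project.

```python
import numpy as np

def uniform_init(n_in, n_out, rng):
    # Unscaled uniform weights in [0, 1): an assumed stand-in for the first attempt.
    return rng.random((n_in, n_out))

def he_init(n_in, n_out, rng):
    # He initialization: zero-mean Gaussian with standard deviation sqrt(2 / n_in).
    return rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)

rng = np.random.default_rng(0)
n_in, n_hidden = 100, 32
# Random binary "permission" vectors, each bit set with probability 0.5.
X = (rng.random((1000, n_in)) < 0.5).astype(float)

z_uniform = X @ uniform_init(n_in, n_hidden, rng)
z_he = X @ he_init(n_in, n_hidden, rng)

print("uniform: mean", z_uniform.mean(), "std", z_uniform.std())
print("He:      mean", z_he.mean(), "std", z_he.std())
```

Under these assumptions the uniform pre-activations sit far from zero (about n/4 on average) and grow with the number of inputs, while the He pre-activations stay near unit scale. If the hidden activation saturates, as a sigmoid does, pre-activations that large push its derivative toward zero, which is one plausible route to the stalled training described above.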

Results

97.16% accuracy, 98.20% F1 on malware classification
Fully manual forward and backward propagation — no framework abstractions
Transparent, interpretable model suitable for educational and security applications
