0 → 1 AI Systems for Privacy, Security, and Data Protection

Past Work

A six-month engineering project to take LLM-based data classification from proof of concept to production. Improved the classifier's F1 score from 50% to over 90% through quantitative evaluation and systematic optimization.

Sensitive Data Discovery with LLMs

Benchmarking frontier LLMs on tagging sensitive data in database schemas. Frontier models achieve ~80% recall and >80% precision on complex, realistic datasets, with smaller models nearly matching their performance.

Data Mapping at a Billion Dollar Self-Driving Vehicle Startup

USENIX PEPR 2022 talk on maintaining visibility into the petabytes of sensitive data collected daily by a commercial fleet of self-driving vehicles: location, imagery, and copious metadata.