digital duct-tape wizard...
Designed a dashboard
Designed and deployed a dashboard for easy exploration of spelling variations and entry errors in record data.
Resolved 500k+ Brazilian government records against our data to append new PII.
Developed an approximation that scales in-memory text clustering to hundreds of thousands of records on a single resource-limited desktop machine.
Gave an internal talk
Gave a tech talk on string distance metrics for fuzzy matching.
Researched data privacy
Curated a privacy-conforming AML data set to support multiple university and NGO research initiatives.
Updated a GitHub repository
Published on GitHub a proof of concept for discreetly watermarking a database extract in a way that scales to 2^128 file recipients.