Paris-Networking


Seminar: Identifying Suspicious URLs: An Application of Large-Scale Online Learning  

Justin Ma, University of California San Diego

Monday, July 6th 2009, 16h00 - 17h00

Location :

LIP6 
104, avenue du President Kennedy
Room 550
75016, Paris 

RER: Line C, station "Avenue du President Kennedy - Maison de Radio-France"
Métro: Line 6, station "Passy"

Abstract :

We explore online learning approaches for detecting
malicious Web sites (those involved in criminal scams) using lexical
and host-based features of the associated URLs.  We show that this
application is particularly appropriate for online algorithms as the
size of the training data is larger than can be efficiently processed
in batch and because the distribution of features that typify
malicious URLs is changing continuously. Using a real-time system we
developed for gathering URL features, combined with a real-time source
of labeled URLs from a large Web mail provider, we demonstrate that
recently developed online algorithms can be as accurate as batch
techniques, achieving daily classification accuracies up to 99% over a
balanced data set.
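
To make the online setting concrete, here is a minimal sketch of
mistake-driven online learning over lexical URL features. It is not the
system described in the talk: it uses the classic perceptron as a
stand-in for the online algorithms mentioned above, it covers only
lexical (token) features and no host-based ones, and the function
names, feature names and example URLs are all hypothetical.

```python
import re


def lexical_features(url):
    """Bag-of-tokens features from the URL string (hypothetical feature set)."""
    tokens = [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]
    features = {"token=" + t: 1.0 for t in tokens}
    features["bias"] = 1.0
    return features


class OnlinePerceptron:
    """Classic perceptron; labels are +1 (malicious) or -1 (benign)."""

    def __init__(self):
        self.weights = {}

    def score(self, features):
        return sum(self.weights.get(name, 0.0) * value
                   for name, value in features.items())

    def predict(self, features):
        return 1 if self.score(features) > 0 else -1

    def update(self, features, label):
        """Predict first, then adjust the weights only on a mistake."""
        predicted = self.predict(features)
        if predicted != label:
            for name, value in features.items():
                self.weights[name] = self.weights.get(name, 0.0) + label * value
        return predicted


# Made-up labeled stream; a real deployment would read a live feed instead.
stream = [
    ("http://paypal.example.com.login-verify.example.net/update", 1),
    ("http://www.example.org/research/seminar", -1),
    ("http://secure-login.example.net/verify/account", 1),
    ("http://www.example.org/people/index.html", -1),
]

model = OnlinePerceptron()
mistakes = 0
for _ in range(2):  # two passes over the toy stream
    for url, label in stream:
        if model.update(lexical_features(url), label) != label:
            mistakes += 1
```

Each URL is scored once and then used for at most one weight update, so
the model never needs the full training set in memory and can follow
features whose distribution shifts over time.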

Bio :

Justin Ma is a PhD candidate at UC San Diego advised by Stefan
Savage, Geoff Voelker and Lawrence Saul.  His research interests are
in systems and networking with an emphasis on network security, and
his current focus is the application of machine learning to problems
in security.

Host :

NPA Group