Phạm Tuấn Anh et al., Tạp chí Khoa học & Công nghệ (Journal of Science and Technology) 122(08): 21-26
PHISHING ATTACKS DETECTION USING GENETIC PROGRAMMING
WITH FEATURES SELECTION
Tuan Anh Pham¹, Thi Huong Chu², Hoang Quan Nguyen², Quang Uy Nguyen², Xuan Hoai Nguyen³, Van Truong Nguyen⁴
¹ Centre of IT, Military Academy of Logistics, Vietnam
² The Faculty of Information Technology, Le Quy Don University, Vietnam
³ IT R&D Center, Hanoi University, Vietnam
⁴ College of Education, TNU, Vietnam
SUMMARY
Phishing is a real threat on the Internet nowadays. Therefore, fighting against phishing attacks is of
great importance. In this paper, we propose a solution to this problem by applying Genetic
Programming with feature selection methods to the phishing detection problem. We conducted
experiments on a data set including both phishing and legitimate sites collected from the Internet.
We compared the performance of Genetic Programming with a number of other machine learning
techniques, and the results showed that Genetic Programming produced the best solutions to
the phishing detection problem.
Keywords: Genetic Programming, Phishing Attack, Machine Learning
INTRODUCTION*
Genetic Programming (GP) [2] is an
evolutionary algorithm that aims to provide
solutions to a user-defined task in the form of
computer programs. Since its introduction,
GP has been applied to many practical
problems [2]. GP has also been used as a
learning tool for solving some problems in
network security [3]. However, to the best of
our knowledge, there has not been any
published work on the use of GP for learning
to detect phishing web sites except our
preliminary work in [4].
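
To make the idea of "solutions in the form of computer programs" concrete, the following minimal sketch shows one way a single GP individual could be represented as an expression tree and scored as a phishing classifier. It is an illustration under stated assumptions, not the setup used in this paper: the function set, the feature names (`x0`, `x1`), the toy samples, and the rule "positive output means phishing" are all hypothetical. A full GP system would additionally evolve a population of such trees through selection, crossover, and mutation.

```python
import operator

# Hypothetical function set for illustration; not the set used in the paper.
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(tree, features):
    """Recursively evaluate an expression tree on one feature vector.

    A tree is a number (constant), a string such as "x0" (feature
    reference), or a tuple (function_name, left_subtree, right_subtree).
    """
    if isinstance(tree, tuple):
        name, left, right = tree
        return FUNCTIONS[name](evaluate(left, features), evaluate(right, features))
    if isinstance(tree, str):
        return features[int(tree[1:])]
    return tree


def classify(tree, features):
    """Label a site as phishing (1) when the program output is positive."""
    return 1 if evaluate(tree, features) > 0 else 0


def fitness(tree, samples):
    """Fraction of labelled samples the program classifies correctly."""
    correct = sum(classify(tree, x) == y for x, y in samples)
    return correct / len(samples)


# Toy data: (feature vector, label) pairs with made-up values.
samples = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 0), ([2.0, 0.5], 1)]

# One candidate program, equivalent to the expression x0 - x1 * 2.
candidate = ("sub", "x0", ("mul", "x1", 2.0))
print(fitness(candidate, samples))
```

In this sketch the fitness is plain classification accuracy on the toy samples; other fitness measures could be substituted without changing the tree representation.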
In the field of network security, phishing
is one of the main threats on the Internet
nowadays [5]. Phishing attackers attempt to
acquire confidential information such as
usernames, passwords, and credit card details
by masquerading as a trustworthy entity in an
online communication [5]. Because they are
simple to carry out, phishing attacks are very common.
According to a report released by an
American security firm, RSA, there were
approximately 33,000 phishing attacks
globally each month in 2012, leading to a loss
of $687 million [1]. Therefore, detecting and
eliminating phishing attacks is very important
not only for organizations but also for
individuals. One popular solution, used by
most web browsers, is to integrate a blacklist
of known phishing sites into the browser.
However, this solution cannot detect a new
attack if the blacklist database is out of
date, so it appears to be ineffective when a
large number of phishing attacks are carried
out every day.
* Tel: 0915 016063, Email: [email protected]
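
The weakness of blacklisting described above can be illustrated with a minimal sketch. The blacklist contents, the host names, and the helper `is_blacklisted` are hypothetical and are not taken from any real browser; the sketch only assumes that a blacklist is a set of known phishing host names checked by exact match.

```python
from urllib.parse import urlparse

# Hypothetical blacklist snapshot; the host names are made up.
BLACKLIST = {"login-secure-bank.example", "paypa1-verify.example"}


def is_blacklisted(url, blacklist=BLACKLIST):
    """Return True when the URL's host name appears in the blacklist."""
    host = urlparse(url).hostname or ""
    return host.lower() in blacklist


known = "http://login-secure-bank.example/signin"
# A newly registered phishing host that the snapshot does not contain yet.
fresh = "http://login-secure-bank2.example/signin"

print(is_blacklisted(known))
print(is_blacklisted(fresh))
```

The first lookup succeeds because the host is already listed, while the second fails even though the site is equally malicious. This gap is what a learned classifier that inspects features of the site itself is meant to close.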
In recent research [4], Pham et al. proposed
a solution to this problem by applying
Genetic Programming to the phishing detection
problem. The results showed that GP
outperforms some other machine learning
methods on this important problem. However,
the research in [4] has some drawbacks:
1) The data set for training and testing was
rather small. Therefore, the models built
from this data set may not generalize well
in a real environment.
2) More importantly, the number of features
used in [4] seems to be limited. Moreover,
some features may not be relevant for
distinguishing between phishing and
legitimate sites. This may hinder the
performance of machine learning methods in
solving this problem.