Pattern Recognition Letters 138 (2020) 580–586
Nonparametric maximum likelihood estimation using neural networks
Hieu Trung Huynh a,b,∗, Linh Nguyen a,c
a Faculty of Information Technology, Industrial University of Ho Chi Minh City, Ho Chi Minh City, Vietnam
b Faculty of Engineering, Vietnamese-German University, Binh Duong City, Vietnam
c Department of Mathematics, University of Idaho, Moscow, ID 83844-1103, United States
Article info
Article history:
Received 2 November 2019
Revised 3 July 2020
Accepted 8 September 2020
Available online 8 September 2020
Keywords:
Neural network
Maximum likelihood estimation
Nonparametric
Probability density function
Abstract
Estimation of probability density functions is an essential component of various applications. Nonparametric techniques have been widely used for this task owing to the difficulty of parameterizing data. In particular, several kernel density estimation methods have been developed. However, they are either incapable of maximum likelihood estimation or require the maintenance of a training set to process new patterns. In this study, a new approach, called the nonparametric maximum likelihood neural network (MLNN), is proposed. It is a nonparametric method relying on maximum likelihood and a neural network; it is compact in form and does not require the maintenance of training patterns. Theoretical and experimental analyses demonstrate the efficacy of the proposed approach.
© 2020 Elsevier B.V. All rights reserved.
1. Introduction
The estimation of probability density functions (pdfs) is an essential topic in statistical modeling. It expresses random variables as functions of other variables based on observed data, or reveals the underlying properties of data distributions [5]. Several frameworks and real-world applications have been reported in which estimating the pdf of collected data is fundamental, including the Bayesian decision rule, behavioral prediction, medical science, genomic analysis, bioinformatics, data compression, and model selection [3,11,27,29].
Let T = {x1, x2, …, xn} be a set of random vectors in the D-dimensional space, R^D, corresponding to independent and identically distributed (iid) observations from a true pdf, f(x). The primary aim of pdf estimation is to construct an estimator f̂(x) that exhibits the following properties: (1) f̂(x) is the closest possible approximation of f(x) and converges asymptotically to f(x) as n → ∞; (2) f̂(x) should be unbiased, i.e., E[f̂(x)] = f(x). Further, the optimal estimator is expected to be easy to compute from the collected data and to exhibit minimum variance over all possible estimators.
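The likelihood criterion behind these properties can be made concrete: given iid samples, a candidate estimator is scored by the average log-likelihood it assigns to the data. A minimal NumPy sketch, using a one-dimensional Gaussian f̂ purely as an illustrative stand-in (the data and densities here are assumptions, not the paper's method):

```python
import numpy as np

def log_likelihood(f_hat, X):
    """Average log-likelihood that the density estimate f_hat assigns to iid samples X."""
    return float(np.mean(np.log(f_hat(X))))

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(0)
X = rng.standard_normal(10_000)  # iid draws from the true pdf f = N(0, 1)

# The true model should score higher than a mis-specified, overly wide Gaussian.
ll_true = log_likelihood(lambda x: gauss_pdf(x, 0.0, 1.0), X)
ll_wide = log_likelihood(lambda x: gauss_pdf(x, 0.0, 3.0), X)
```

Maximizing this score over a family of candidate densities is exactly the ML principle invoked throughout the paper.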
Parametric and nonparametric approaches are two popular pdf
estimation techniques [5]. The parametric technique assumes that
the form of f(x) is known and that it is parameterized by a vector θ, i.e., f(x) = f(x,θ). In this case, pdf estimation is reduced to
the task of estimating the optimal values of the parameters based
on the available data. One of the most popular approaches to this
∗ Corresponding author.
E-mail address: [email protected] (H.T. Huynh).
task is based on the Gaussian mixture model with maximum likelihood (ML) estimation. This approach often lends itself to efficient
computation. However, the assumption about the pdf’s form might
not be correct, which can result in failure to achieve the desirable
properties. However, nonparametric estimation is preferable in the
case of statistical modeling where the pdfs are difficult to parameterize. Unlike the parametric technique, the nonparametric method
does not need to assume any specific form of f(x) and, instead, attempts to estimate the pdf’s form directly based on the observed
data.
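As a concrete instance of the parametric route described above, the ML estimates for a single Gaussian have closed form: the sample mean and the (biased) sample standard deviation. A minimal sketch, assuming one-dimensional synthetic data for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(loc=2.0, scale=0.5, size=50_000)  # iid data matching the assumed form

# Closed-form ML estimates under the Gaussian assumption f(x) = f(x; mu, sigma):
mu_hat = float(X.mean())          # ML estimate of the mean
sigma_hat = float(X.std(ddof=0))  # ML (biased) estimate of the standard deviation

def f_hat(x):
    """The fitted parametric density f(x; mu_hat, sigma_hat)."""
    z = (x - mu_hat) / sigma_hat
    return np.exp(-0.5 * z ** 2) / (sigma_hat * np.sqrt(2.0 * np.pi))
```

If the true density is not Gaussian, this same procedure still returns a Gaussian, which is precisely the mis-specification risk the text points out.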
One of the most fundamental nonparametric estimators is the
histogram, which approximates data density by using a set of rectangular functions with known locations and widths (bins). The
height of each rectangle is determined by the number of data patterns within the corresponding bin. This technique is simple; however, it is prone to generating discontinuous pdfs that depend on
both the locations of bin centers and their widths. To overcome
this problem, kernel density estimation (KDE)-based approaches
have been developed [5]. The Parzen window (PW) is one of the most popular kernel-based approaches; it relies on a combination of kernels to estimate the pdf. Even though PW is effective, kernel density estimators require a predefined bandwidth and are incapable of maximizing the likelihood. Moreover, selecting the optimal kernel function is challenging. A nonparametric density estimator that overcomes these issues and maximizes the likelihood was proposed by Agarwal et al. [1]. In this method, the estimator is determined over the class of functions whose square roots have Fourier transforms with finite support. It is smooth, nonnegative, efficiently computable, and
https://doi.org/10.1016/j.patrec.2020.09.006
0167-8655/© 2020 Elsevier B.V. All rights reserved.
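Returning to the Parzen-window estimator discussed in the introduction: with a kernel K and a predefined bandwidth h, the estimate is f̂(x) = (1/(nh)) Σ_i K((x − x_i)/h). A minimal NumPy sketch with a Gaussian kernel (the data, bandwidth, and kernel choice are illustrative assumptions, and the fixed h is exactly the requirement the text criticizes):

```python
import numpy as np

def parzen_window(X, h):
    """Parzen-window estimate f_hat(x) = (1/(n*h)) * sum_i K((x - x_i)/h), Gaussian K."""
    X = np.asarray(X, dtype=float)
    def f_hat(x):
        u = (np.atleast_1d(x)[:, None] - X[None, :]) / h   # shape (m, n)
        K = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)   # Gaussian kernel values
        return K.mean(axis=1) / h
    return f_hat

rng = np.random.default_rng(2)
X = rng.standard_normal(2_000)
f_hat = parzen_window(X, h=0.3)  # h must be fixed in advance; it is not fitted by ML

# Sanity check: the estimate is a density, so it should integrate to ~1.
grid = np.linspace(-6.0, 6.0, 1_201)
mass = float(np.sum(f_hat(grid)) * (grid[1] - grid[0]))
```

Note that every evaluation of f̂ sums over all n training points, which is the "maintenance of a training set" cost that the proposed MLNN is designed to avoid.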