One-class classification

Article snapshot taken from Wikipedia, available under the Creative Commons Attribution-ShareAlike license.

In machine learning, one-class classification (OCC), also known as unary classification or class-modelling, tries to identify objects of a specific class amongst all objects, by primarily learning from a training set containing only the objects of that class, although there exist variants of one-class classifiers in which counter-examples are used to further refine the classification boundary. This is different from and more difficult than the traditional classification problem, which tries to distinguish between two or more classes with a training set containing objects from all the classes. Examples include monitoring helicopter gearboxes, predicting motor failure, or labelling the operational status of a nuclear plant as 'normal': in these scenarios there are few, if any, examples of catastrophic system states; only the statistics of normal operation are known.
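To make the monitoring scenario concrete, the sketch below fits a one-class model on simulated "normal operation" readings only and then flags new readings as normal or outlying. It uses scikit-learn's OneClassSVM (the SVM-based approach discussed later in the article); the synthetic data, feature count, and nu value are illustrative assumptions, not from the source.

import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)
normal_readings = rng.normal(loc=0.0, scale=1.0, size=(500, 4))  # training set: the "normal" class only

clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)  # nu ~ assumed fraction of borderline points
clf.fit(normal_readings)

new_readings = np.vstack([
    rng.normal(0.0, 1.0, size=(5, 4)),   # readings resembling normal operation
    rng.normal(6.0, 1.0, size=(2, 4)),   # readings far from anything seen in training
])
print(clf.predict(new_readings))  # +1 = classified as normal, -1 = flagged as outlier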



In the Gaussian density model, the probability density of a d-dimensional object x is given by:

p_{\mathcal{N}}(x;\mu,\Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu) \right\}

where μ is the mean and Σ is the covariance matrix.
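A direct NumPy transcription of this density is sketched below; mu and sigma would be estimated from the target-class training data, and a well-conditioned covariance is assumed here (the inversion issue is discussed further below).

import numpy as np

def gaussian_density(x, mu, sigma):
    # Multivariate normal density p_N(x; mu, sigma) for a d-dimensional point x.
    d = mu.shape[0]
    diff = x - mu
    inv = np.linalg.inv(sigma)  # assumes sigma is well conditioned; see the pseudo-inverse note below
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * diff @ inv @ diff) / norm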

In reconstruction methods, a generating model is fitted to the data, and new objects can be described in terms of a state of that generating model. Some examples of reconstruction methods for OCC are k-means clustering, learning vector quantization, and self-organizing maps. The basic Support Vector Machine (SVM) paradigm is trained using both positive and negative examples; however, studies have shown that there are many valid reasons for using only positive examples.
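As a sketch of the reconstruction idea mentioned above, the snippet below uses k-means: a new object is "reconstructed" by its nearest cluster centre, and a large reconstruction error marks it as an outlier. The number of clusters and the 95th-percentile threshold are illustrative choices, not prescribed by the article.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X_train = rng.normal(size=(300, 2))  # target-class data only

km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X_train)
train_err = km.transform(X_train).min(axis=1)      # distance to the nearest centre
threshold = np.percentile(train_err, 95)           # accept 95% of the training data

def is_target(x):
    err = km.transform(x.reshape(1, -1)).min(axis=1)[0]
    return err <= threshold                        # small reconstruction error -> target class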


The term one-class classification (OCC) was coined by Moya & Hush (1996), and many applications can be found in the scientific literature, for example outlier detection, anomaly detection, and novelty detection. A feature of OCC is that it uses only sample points from the assigned class, so that a representative sampling is not strictly required for non-target classes.


In PU (positive-unlabeled) learning, the mixed (unlabeled) set U is assumed to contain both positive and negative samples, but without these being labeled as such. This contrasts with other forms of semisupervised learning, where it is assumed that a labeled set containing examples of both classes is available in addition to unlabeled samples. A variety of techniques exist to adapt supervised classifiers to the PU learning setting, including variants of the EM algorithm. PU learning has been successfully applied to text, time series, bioinformatics tasks, and remote sensing data.

Several approaches have been proposed to solve one-class classification (OCC). These approaches can be distinguished into three main categories: density estimation, boundary methods, and reconstruction methods. Density estimation methods rely on estimating the density of the data points and setting a threshold.


One-class classification has similarities with unsupervised concept drift detection, where both aim to identify whether the unseen data share similar characteristics with the initial data. A concept refers to the fixed probability distribution from which the data is drawn.

While many of the above approaches focus on the case of removing a small number of outliers or anomalies, one can also learn the other extreme, where the single class covers a small coherent subset of the data, using an information bottleneck approach.

In unsupervised concept drift detection, the goal is to detect whether the data distribution changes without utilizing class labels. In one-class classification, the flow of data is not important: unseen data is classified as typical or as an outlier depending on its characteristics, whether or not it comes from the initial concept. Unsupervised drift detection, by contrast, monitors the flow of data and signals a drift if a significant change in the data distribution is observed.

SVM-based one-class classification (OCC) relies on identifying the smallest hypersphere (with radius r and center c) containing all the data points. This method is called Support Vector Data Description (SVDD). Formally, the problem can be defined in the following constrained optimization form:

\min_{r,c}\; r^{2} \quad \text{subject to} \quad \|\Phi(x_{i})-c\|^{2} \leq r^{2} \quad \forall\, i=1,2,\ldots,n

Computing the inverse of the covariance matrix (Σ⁻¹) is the costliest operation, and in cases where the data is not scaled properly or has singular directions, the pseudo-inverse Σ⁺ is used to approximate the inverse; it is calculated as Σᵀ(ΣΣᵀ)⁻¹.
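In NumPy the pseudo-inverse would typically be obtained with np.linalg.pinv (SVD-based). The short check below also evaluates the right-inverse form quoted above on a nonsingular covariance, where it coincides with the ordinary inverse; the random data is purely illustrative.

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))
sigma = np.cov(A, rowvar=False)                        # well-conditioned 3x3 covariance

right_inv = sigma.T @ np.linalg.inv(sigma @ sigma.T)   # the form quoted in the text
assert np.allclose(right_inv, np.linalg.inv(sigma))    # equals the ordinary inverse here

B = np.hstack([A, A[:, :1]])                           # duplicated feature -> singular covariance
sigma_pinv = np.linalg.pinv(np.cov(B, rowvar=False))   # Moore-Penrose pseudo-inverse still exists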

Boundary methods focus on setting boundaries around a small set of points, called target points. These methods attempt to optimize the enclosed volume. Boundary methods rely on distances and hence are not robust to scale variance. The K-centers method, NN-d, and SVDD are some of the key examples.

From the Karush–Kuhn–Tucker conditions for optimality, we get

c = \sum_{i=1}^{n}\alpha_{i}\,\Phi(x_{i}),

where the α_i are the solution to the following optimization problem:

\max_{\alpha}\; \sum_{i=1}^{n}\alpha_{i}\,\kappa(x_{i},x_{i}) - \sum_{i,j=1}^{n}\alpha_{i}\alpha_{j}\,\kappa(x_{i},x_{j})

\text{subject to} \quad \sum_{i=1}^{n}\alpha_{i}=1 \quad \text{and} \quad 0 \leq \alpha_{i} \leq \frac{1}{\nu n} \quad \text{for all } i=1,2,\ldots,n.

The introduction of a kernel function κ provides additional flexibility to the One-class SVM (OSVM) algorithm.
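The dual above can be solved numerically with a generic constrained optimiser, as sketched below with an RBF kernel and SciPy's SLSQP. Dedicated SVDD/OSVM solvers would be used in practice; the data, kernel width, and ν value are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))          # target-class training points
n, nu = len(X), 0.1
K = rbf_kernel(X, gamma=0.5)

def neg_dual(alpha):                  # minimise the negative of the dual objective
    return -(alpha @ np.diag(K) - alpha @ K @ alpha)

res = minimize(neg_dual, x0=np.full(n, 1.0 / n), method="SLSQP",
               bounds=[(0.0, 1.0 / (nu * n))] * n,
               constraints=[{"type": "eq", "fun": lambda a: a.sum() - 1.0}])
alpha = res.x
support = np.flatnonzero(alpha > 1e-6)  # support vectors define the centre c = sum_i alpha_i Phi(x_i)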

A similar problem is PU learning, in which a binary classifier is constructed by semi-supervised learning from only positive and unlabeled sample points. In PU learning, two sets of examples are assumed to be available for training: the positive set P and a mixed (unlabeled) set U.
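A minimal sketch of one common PU heuristic is given below (a "two-step" scheme, not the EM-based variants mentioned above): first treat every unlabeled point as negative, then keep only the lowest-scoring unlabeled points as reliable negatives and retrain. The classifier choice and the 30% negative fraction are assumptions for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

def pu_two_step(X_pos, X_unlabeled, neg_fraction=0.3):
    # Step 1: train positive vs. all-unlabeled-treated-as-negative.
    X = np.vstack([X_pos, X_unlabeled])
    y = np.r_[np.ones(len(X_pos)), np.zeros(len(X_unlabeled))]
    first = LogisticRegression(max_iter=1000).fit(X, y)

    # Step 2: keep the least positive-looking unlabeled points as reliable negatives and retrain.
    scores = first.predict_proba(X_unlabeled)[:, 1]
    n_neg = max(1, int(neg_fraction * len(X_unlabeled)))
    reliable_neg = X_unlabeled[np.argsort(scores)[:n_neg]]

    X2 = np.vstack([X_pos, reliable_neg])
    y2 = np.r_[np.ones(len(X_pos)), np.zeros(len(reliable_neg))]
    return LogisticRegression(max_iter=1000).fit(X2, y2)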


When the SVM algorithm is modified to use only positive examples, the process is considered one-class classification. One situation where this type of classification might prove useful to the SVM paradigm is in trying to identify a web browser's sites of interest based only on the user's browsing history. One-class classification can be particularly useful in biomedical studies, where data from other classes can often be difficult or impossible to obtain. In studying biomedical data, it can be difficult and/or expensive to obtain the set of labeled data from the second class that would be necessary to perform a two-class classification.


The hard SVDD formulation above is highly restrictive and sensitive to the presence of outliers. Therefore, a flexible formulation that allows for the presence of outliers is used instead:

\min_{r,c,\zeta}\; r^{2} + \frac{1}{\nu n}\sum_{i=1}^{n}\zeta_{i} \quad \text{subject to} \quad \|\Phi(x_{i})-c\|^{2} \leq r^{2} + \zeta_{i} \quad \forall\, i=1,2,\ldots,n

Density estimation methods rely on assuming a distribution, such as a Gaussian or a Poisson distribution, after which discordancy tests can be used to test new objects. These methods are robust to scale variance. The Gaussian model is one of the simplest ways to create a one-class classifier. Due to the Central Limit Theorem (CLT), it works best when a large number of samples are present and they are perturbed by small, independent error values. The probability density it assigns to a d-dimensional object is the multivariate normal density given earlier.
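Putting the Gaussian model to work as a classifier might look like the sketch below: estimate μ and Σ from the target class, then accept a new point if its Mahalanobis distance (equivalently, its density) is within a threshold taken from the training data. The 95th-percentile threshold and the synthetic data are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(3)
X_train = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], size=500)

mu = X_train.mean(axis=0)
sigma_inv = np.linalg.pinv(np.cov(X_train, rowvar=False))  # pseudo-inverse, as discussed above

def mahalanobis_sq(x):
    d = x - mu
    return d @ sigma_inv @ d

threshold = np.percentile([mahalanobis_sq(x) for x in X_train], 95)

def is_target(x):
    return mahalanobis_sq(x) <= threshold  # True -> accepted as a member of the target class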

K-centers: In the K-center algorithm, k small balls with equal radius are placed so as to minimize the maximum over all training objects of the minimum distance between a training object and the centers. Formally, the following error is minimized:

\varepsilon_{k\text{-center}} = \max_{i}\left(\min_{k}\|x_{i}-\mu_{k}\|^{2}\right)

The algorithm uses a forward search method with random initialization, where the radius is determined by the maximum distance to the objects that any given ball should capture.

After the centers are determined, for any given test object z the distance can be calculated as:

d_{k\text{-center}}(z) = \min_{k}\|z-\mu_{k}\|^{2}

Reconstruction methods use prior knowledge and a generating process to build a generating model that best fits the data.
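Returning to the K-centers method above, a greedy farthest-point sketch is shown below (a simplification, not necessarily the exact forward-search variant described): centres are chosen from the training data, the covering radius is recorded, and a test object is accepted if it falls within that radius of some centre.

import numpy as np

def k_centers_fit(X, k, seed=0):
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]                     # random initialisation
    for _ in range(k - 1):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])                     # farthest remaining point becomes a centre
    centers = np.array(centers)
    radius = np.max(np.min(((X[:, None] - centers) ** 2).sum(axis=-1), axis=1))
    return centers, radius                                  # radius plays the role of epsilon_{k-center}

def k_centers_is_target(z, centers, radius):
    return np.min(np.sum((z - centers) ** 2, axis=1)) <= radius   # d_{k-center}(z) within the learned radius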

A study in The Scientific World Journal found that the typicality approach is the most useful for analysing biomedical data because it can be applied to any type of dataset (continuous, discrete, or nominal). The typicality approach is based on clustering the data by examining it and placing it into new or existing clusters. To apply typicality to one-class classification for biomedical studies, each new observation y_0 is compared to the target class C and identified as either an outlier or a member of the target class.
