This paper presents the scattered rectangle feature, a variant of the Haar-like feature used in the Viola and Jones detection framework, derived from a common-component analysis of local region features. Three common components (feature filter, feature structure and feature form) are extracted without regard to the details of the individual region features studied. This view casts new light on region feature design for specific applications and requirements: one can modify one or more components of a feature to improve it, or combine components of existing features to form a new one. The scattered rectangle feature follows the former approach. It frees the feature structure component of the Haar-like feature from the geometric adjacency rule, yielding a richer representation that explores many more orientations than horizontal, vertical and diagonal, and captures misaligned, detached and non-rectangular shape information that Haar-like features cannot reach. The training results of the two face detectors in the experiments illustrate the benefits of the scattered rectangle feature empirically, and a comparison of ROC curves under a rigid and objective detection criterion on the MIT+CMU upright face test set shows that the cascade based on scattered rectangle features outperforms the one based on Haar-like features.
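The difference between an adjacency-bound feature and one whose rectangles are free to scatter can be illustrated with a small sketch. It is not the paper's implementation: the `(x, y, w, h, weight)` rectangle tuples, the function names, the +1/-1 weights and the toy 4x4 image are all assumptions made for illustration. The only technique it relies on is the summed-area table (integral image) of the Viola and Jones framework.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2-D list of pixel values."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def rectangle_feature(ii, rects):
    """Weighted sum over rectangles given as (x, y, w, h, weight) tuples.

    The tuple layout and the weights are illustrative assumptions.
    """
    return sum(wt * rect_sum(ii, x, y, w, h) for x, y, w, h, wt in rects)


# Toy 4x4 image (hypothetical data).
img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
ii = integral_image(img)

# Haar-like two-rectangle feature: the two rectangles share an edge.
haar = rectangle_feature(ii, [(0, 0, 2, 4, +1), (2, 0, 2, 4, -1)])

# Scattered variant: two detached, misaligned rectangles of equal area.
scattered = rectangle_feature(ii, [(0, 0, 2, 1, +1), (2, 2, 2, 1, -1)])
```

Both features cost the same four table lookups per rectangle, so dropping the adjacency rule in this sketch enlarges the set of candidate features without changing the cost of evaluating any one of them.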
Classification of network traffic is an essential step in much network research. However, with the rapid evolution of Internet applications, the effectiveness of port-based and payload-based identification approaches has diminished greatly in recent years, and many researchers have turned their attention to machine learning-based methods as an alternative. This paper presents a novel machine learning-based classification model that combines the ensemble learning paradigm with co-training techniques. Whereas most previous approaches employed only a single classifier, our method applies multiple classifiers and semi-supervised learning, which mainly helps to overcome three shortcomings: limited flow accuracy rate, weak adaptability and heavy demand for labeled training data. Statistical characteristics of IP flows are extracted from packet-level traces to establish the feature set; the classification model is then created and tested, and the empirical results demonstrate its feasibility and effectiveness.
HE HaiTao, LUO XiaoNan, MA FeiTeng, CHE ChunHui, WANG JianMin
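The feature-extraction step mentioned in the traffic classification abstract, deriving statistical characteristics of an IP flow from packet-level traces, might look like the minimal sketch below. The abstract does not list the features actually used, so the specific statistics, the `flow_features` name and the `(timestamp, size)` packet representation are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Compute simple per-flow statistics.

    packets: list of (timestamp_seconds, size_bytes) tuples for one flow,
    sorted by time. Both the input layout and the chosen statistics are
    illustrative assumptions, not the paper's feature set.
    """
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Inter-arrival times between consecutive packets.
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "packet_count": len(packets),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "duration": times[-1] - times[0],
    }


# Hypothetical four-packet flow.
packets = [(0.0, 60), (0.5, 1500), (1.0, 1500), (2.0, 60)]
features = flow_features(packets)
```

A vector of this kind, computed per flow, is what a classifier ensemble would consume; it depends on neither port numbers nor payload contents.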