The objective of the present invention is to provide a machine learning-based Chinese word segmentation apparatus that segments a Chinese sentence into words using a hybrid method combining a machine learning-based method with a heuristic-based longest matching method. To this end, the machine learning-based Chinese word segmentation apparatus comprises: a feature extraction part that extracts features used for both machine learning and word segmentation; a machine learning part that generates a learning model from the result of the machine learning; and a word segmentation part that segments a Chinese sentence into words using the features extracted by the feature extraction part and the learning model, wherein the feature extraction part uses a heuristic-based method, a method that uses context as a feature, and a method that uses linguistic characteristics as a feature. COPYRIGHT KIPO 2017
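
The three parts named in the abstract can be illustrated with a minimal sketch. The abstract does not disclose the actual feature templates, tag set, or learning algorithm, so everything below is an assumption made for illustration: characters are labeled with B/M/E/S position tags, the heuristic feature is the tag proposed by forward longest matching against a hypothetical lexicon, the context features are neighboring characters, the linguistic feature is a coarse character type, and the learner is a plain multiclass perceptron. All function names are hypothetical.

```python
from collections import defaultdict

TAGS = ["B", "M", "E", "S"]  # assumed tag set: begin, middle, end, single


def longest_match_tags(sentence, lexicon, max_len=4):
    """Heuristic part: forward longest matching, expressed as B/M/E/S tags."""
    tags, i = [], 0
    while i < len(sentence):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            if size == 1 or sentence[i:i + size] in lexicon:
                break
        if size == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (size - 2) + ["E"])
        i += size
    return tags


def char_type(ch):
    """Linguistic characteristic (assumed): coarse character class."""
    if ch.isdigit():
        return "digit"
    if ch.isascii() and ch.isalpha():
        return "latin"
    if "\u4e00" <= ch <= "\u9fff":
        return "han"
    return "other"


def extract_features(sentence, i, lm_tags):
    """Feature extraction part: shared by training and segmentation."""
    prev_ch = sentence[i - 1] if i > 0 else "<s>"
    next_ch = sentence[i + 1] if i + 1 < len(sentence) else "</s>"
    return [
        "lm=" + lm_tags[i],               # heuristic-based feature
        "c0=" + sentence[i],              # context features
        "c-1=" + prev_ch,
        "c+1=" + next_ch,
        "c-1c0=" + prev_ch + sentence[i],
        "type=" + char_type(sentence[i]),  # linguistic feature
    ]


def predict(weights, feats):
    """Return the highest-scoring tag; ties go to the earliest tag in TAGS."""
    return max(TAGS, key=lambda t: sum(weights[(f, t)] for f in feats))


def train(data, lexicon, epochs=10):
    """Machine learning part: a simple perceptron standing in for the model."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for sentence, gold in data:
            lm_tags = longest_match_tags(sentence, lexicon)
            for i, gold_tag in enumerate(gold):
                feats = extract_features(sentence, i, lm_tags)
                guess = predict(weights, feats)
                if guess != gold_tag:
                    for f in feats:
                        weights[(f, gold_tag)] += 1.0
                        weights[(f, guess)] -= 1.0
    return weights


def segment(sentence, weights, lexicon):
    """Word segmentation part: tag each character, then cut after E or S."""
    lm_tags = longest_match_tags(sentence, lexicon)
    words, current = [], ""
    for i, ch in enumerate(sentence):
        current += ch
        if predict(weights, extract_features(sentence, i, lm_tags)) in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # flush a trailing word that never received an E or S tag
        words.append(current)
    return words


# Toy usage with a one-entry lexicon and a single hand-tagged sentence.
lexicon = {"北京"}
weights = train([("我爱北京", ["S", "S", "B", "E"])], lexicon)
print(segment("我爱北京", weights, lexicon))
```

Passing the longest-matching result in as a feature, rather than applying it as a separate post-processing step, is one plausible reading of the "hybrid method": the learned model can follow the dictionary when it is reliable and override it when the context features disagree.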