This research aimed to develop a semi-automatic system for annotating textual domain content that is part of the intangible cultural heritage of Phattalung province in southern Thailand. The system combines unsupervised and supervised techniques for named entity recognition. The unsupervised techniques identify named entities by matching text against an ontology, which allows the system to suggest annotation terms to users. When the system encounters an ambiguous entity, the user may annotate it manually, and the system records that annotation in a log file. The log file then serves as training data for a classification model, so that the system can assign the correct entity type when it later meets the same ambiguous entity in a similar context. The evaluation found that named entity recognition was effective, with an average precision of 88.07% and an average recall of 82.10%. In contrast, the effectiveness of the relationship extraction component was low on sentences with uncleaned structure, owing to limitations of natural language processing; when the sentence structure was cleaned, extraction performance improved. Therefore, the user’s self-annotation is still required for relationship annotation to increase the correctness of the annotations.
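The hybrid workflow described above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the ontology entries, the entity type names, the `Annotator` class, and the word-overlap scoring are all assumptions made for illustration, and the word-overlap rule merely stands in for whatever classification model the system uses.

```python
from collections import Counter

# Toy ontology invented for illustration (not taken from the paper):
# lower-cased surface term -> candidate entity types.
ONTOLOGY = {
    "nora": ["PerformingArt"],
    "talung": ["PerformingArt", "Place"],  # more than one type: ambiguous
}


class Annotator:
    """Ontology lookup first; logged user annotations resolve ambiguity."""

    def __init__(self, ontology):
        self.ontology = ontology
        self.log = []  # entries: (term, context word set, entity type)

    def record_user_annotation(self, term, context, entity_type):
        # Store the user's manual choice as a future training example.
        words = frozenset(w.lower() for w in context)
        self.log.append((term.lower(), words, entity_type))

    def _classify(self, term, context):
        # Stand-in classifier (an assumption): score each logged type by
        # how many context words it shares with the current sentence.
        scores = Counter()
        current = set(context)
        for logged_term, logged_context, entity_type in self.log:
            if logged_term == term:
                scores[entity_type] += len(logged_context & current)
        if not scores:
            return None
        best_type, best_score = scores.most_common(1)[0]
        return best_type if best_score > 0 else None

    def annotate(self, tokens):
        results = []
        for i, token in enumerate(tokens):
            term = token.lower()
            candidates = self.ontology.get(term)
            if not candidates:
                continue  # not a known entity
            if len(candidates) == 1:
                results.append((token, candidates[0], "ontology"))
                continue
            # Ambiguous term: use the other words in the sentence as context.
            context = [t.lower() for j, t in enumerate(tokens) if j != i]
            guess = self._classify(term, context)
            if guess is not None:
                results.append((token, guess, "classifier"))
            else:
                results.append((token, None, "ask_user"))
        return results
```

In this sketch, an ambiguous term is first returned with the source `"ask_user"`. Once the user's choice has been passed to `record_user_annotation`, a later sentence that shares context words with the logged one is resolved with the source `"classifier"` instead.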


