Class: NLP-1 (CS574, WST551)

Text analytics and knowledge mining

This class studies web text analysis, its link to Linked Open Data (LOD), and knowledge extraction and enrichment based on NLP and data web technology. Topics include the big data problem of web texts, the scalability and accuracy of text analytics, and their use for the Semantic Web and question answering. We will study the data web and knowledge mining from web text, which is a kind of big data. NLP combined with linked data is an application-rich fundamental technology that has become essential for emerging next-generation web-based knowledge mining.

See http://www.okbqa.org/ for open knowledge bases and question answering.

The aim of the course is to generate a knowledge base from web text (particularly Wikipedia), using DBpedia and its ontology as the backbone, through machine learning technologies. We will study the whole process and take on real problems: how to build a question answering system, and how to boost knowledge through data/knowledge fusion by linking to existing linked open data. The reference model is the IBM Watson system for DeepQA.

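In this course we analyze textual web data and generate or link linked data (such as LOD), studying natural language processing and the data web as they apply to data and knowledge on the web. Web text is big data, to be analyzed with attention to both the scale and the accuracy of text analytics; once turned into knowledge, it is used for the Semantic Web and QA (question answering). By learning the knowledge mining process that turns web text, a kind of big data, into data-web and knowledge form, students study broad and highly applicable natural language processing techniques combined with linked data, preparing for the next-generation web-based knowledge mining that is emerging worldwide.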

© Key-Sun Choi 2013-2015