SELF-ORGANIZING MAPS IN
NATURAL LANGUAGE PROCESSING
Timo Honkela
Helsinki University of Technology
Neural Networks Research Centre
P.O. Box
FIN HUT
FINLAND
Thesis for the degree of Doctor of Philosophy to be
presented with due permission for public examination
and criticism in the Auditorium F of the Helsinki
University of Technology on a Friday in December, at noon.
ESPOO
Abstract
Kohonen's Self-Organizing Map (SOM) is one of the most popular artificial
neural network algorithms. Word category maps are SOMs that have been
organized according to word similarities, measured by the similarity of the
short contexts of the words. Conceptually interrelated words tend to fall into
the same or neighboring map nodes. Nodes may thus be viewed as word
categories. Although no a priori information about classes is given, during the
self-organizing process a model of the word classes emerges.
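
The idea of a word category map can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the toy corpus, the one-hot coding of neighboring words, the 3-by-3 grid, the linearly decaying learning rate and neighborhood radius, and the names context_vectors, best_matching_unit and train_som are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def context_vectors(sentences):
    """Represent each word by the averaged one-hot codes of its left and right neighbors."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    sums = {w: [0.0] * (2 * n) for w in vocab}
    counts = defaultdict(int)
    for s in sentences:
        for i, w in enumerate(s):
            if i > 0:
                sums[w][index[s[i - 1]]] += 1.0      # left-neighbor half
            if i + 1 < len(s):
                sums[w][n + index[s[i + 1]]] += 1.0  # right-neighbor half
            counts[w] += 1
    return {w: [x / counts[w] for x in v] for w, v in sums.items()}


def best_matching_unit(nodes, x):
    """Return the grid position of the node whose model vector is closest to x."""
    return min(nodes, key=lambda pos: sum((a - b) ** 2 for a, b in zip(nodes[pos], x)))


def train_som(data, rows, cols, steps=2000, seed=0):
    """Online SOM training with a Gaussian neighborhood on a rectangular grid."""
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = {(r, c): [rng.random() for _ in range(dim)]
             for r in range(rows) for c in range(cols)}
    for t in range(steps):
        frac = t / steps
        alpha = 0.5 * (1.0 - frac)                               # learning rate decays
        sigma = max(0.5, (max(rows, cols) / 2) * (1.0 - frac))   # radius shrinks
        x = rng.choice(data)
        br, bc = best_matching_unit(nodes, x)
        for (r, c), m in nodes.items():
            d2 = (r - br) ** 2 + (c - bc) ** 2
            h = alpha * math.exp(-d2 / (2 * sigma ** 2))
            for k in range(dim):
                m[k] += h * (x[k] - m[k])                        # pull model toward input
    return nodes


# Toy corpus (an assumption): "cat" and "dog" occur in identical short contexts.
sentences = [s.split() for s in [
    "the cat eats fish", "the dog eats meat",
    "a cat drinks milk", "a dog drinks water",
]]
vectors = context_vectors(sentences)
som = train_som(list(vectors.values()), rows=3, cols=3)
word_to_node = {w: best_matching_unit(som, v) for w, v in vectors.items()}
```

In this toy corpus "cat" and "dog" have identical context vectors, so they are assigned to the same map node although no class labels were supplied. Words with merely similar contexts would be expected to land on the same or nearby nodes, but that depends on the data and the training parameters.
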
The central topic of the thesis is the use of the SOM in natural language
processing. The approach based on the word category maps is compared with the
methods that are widely used in artificial intelligence research. Modeling
gradience, conceptual change, and subjectivity of natural language
interpretation are considered. The main application area is information
retrieval and textual data mining, for which a specific SOM-based method called
the WEBSOM has been developed. The WEBSOM method organizes a document
collection on a map display that provides an overview of the collection and
facilitates interactive browsing.
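
What a map display for browsing might look like can be sketched as follows. This is only an illustration of the overview-and-browse idea, not the WEBSOM software: the document-to-node assignments are made up here (in a real system they would come from a trained document map), and build_map_display and browse are hypothetical names.

```python
from collections import defaultdict


def build_map_display(assignments, rows, cols):
    """Group documents by map node and render per-node document counts as a text grid."""
    cells = defaultdict(list)
    for doc, pos in assignments.items():
        cells[pos].append(doc)
    grid = "\n".join(
        " ".join(f"{len(cells[(r, c)]):2d}" for c in range(cols))
        for r in range(rows)
    )
    return cells, grid


def browse(cells, pos, radius=1):
    """Return the documents at a node and at its grid neighbors within the radius."""
    r0, c0 = pos
    hits = []
    for (r, c), docs in sorted(cells.items()):
        if abs(r - r0) <= radius and abs(c - c0) <= radius:
            hits.extend(sorted(docs))
    return hits


# Hypothetical assignments of documents to nodes of a 3-by-3 map.
assignments = {"doc_a": (0, 0), "doc_b": (0, 1), "doc_c": (2, 2)}
cells, grid = build_map_display(assignments, rows=3, cols=3)
nearby = browse(cells, (0, 0))
```

The text grid plays the role of the overview, and browse imitates selecting a map region to list the documents located there.
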
Contents
INTRODUCTION
AUTHOR'S MOTIVATION FOR THE WORK
THE SELF-ORGANIZING MAP
The Self-Organizing Map algorithm
Multiple views of the SOM
SELF-ORGANIZING MAPS IN NATURAL LANGUAGE PROCESSING
Word category maps
Other works based on and related to