To attend, please first register as a member on the SIGMOD home page
( http://www.sigmodj.is.uec.ac.jp/ )
(membership is free; you need not register again if you are already a member),
then send the registration form to
sigmodj_lecture@tkl.iis.u-tokyo.ac.jp.
Kitsuregawa (University of Tokyo, SIGMOD-J Chair)
Contact: ACM SIGMOD Japan Chapter
sigmodj_lecture@tkl.iis.u-tokyo.ac.jp
http://www.sigmodj.is.uec.ac.jp/
8<------------------------------------
To: sigmodj_lecture@tkl.iis.u-tokyo.ac.jp
ACM SIGMODJ November 26 Lecture: Registration Form
Name:
Affiliation:
e-mail:
8<------------------------------------
☆ Lecture ====================================================
Title:    Clustering Web Documents
Speaker:  Prof. Mukesh Mohania (IBM India)
Date:     November 26, 5:30 pm to 6:30 pm
Venue:    Institute of Industrial Science, University of Tokyo (Komaba Campus)
          Building E, west side, 5th floor, Conference Rooms A and B (Ew-501, 502)
TEL 03(5452)6254/6256
http://www.iis.u-tokyo.ac.jp/map/index.html
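Odakyu Line: 7 minutes from Higashi-Kitazawa (the nearest station)
Odakyu Line: 12 minutes from Yoyogi-Uehara
Inokashira Line: 10 minutes from Komaba-Todaimae
If you are coming from Komaba-Todaimae Station, please enter through the East Gate.
Admission: free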
----------------------------------------------------------
Title: Clustering Web Documents
Speaker: Prof. Mukesh Mohania (IBM India)
Abstract:
Users are increasingly relying on search engines to obtain
useful information from the web. It is becoming more and more
difficult for users to find relevant information as a large number of
documents are returned as a result of a search. Hence, to make
search effective, it is necessary to categorize documents into sets
(i.e. clusters) based on some subject or similarity. A way to cluster
documents based on relative similarity between them will be explored
in this talk. The documents are scanned and important keywords or
document representatives are obtained from each document. Weights are
assigned to these keywords based on their location in the document,
frequency and various other factors. We will then discuss the
Row-Column Iterative Algorithm that is applied to the set of N
documents to form clusters based on the relative similarity of
documents. We will also discuss some ongoing research projects.
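
For readers unfamiliar with the general idea, the pipeline described in the
abstract (weight keywords by location and frequency, then group documents by
similarity) can be illustrated roughly as follows. This is only a minimal
sketch under stated assumptions: it is NOT the Row-Column Iterative Algorithm,
whose details are not given in this announcement. The location weights, the
use of cosine similarity, the greedy grouping, the threshold, and all function
names are hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical location weights: a keyword in the title counts more
# than the same keyword in the body. These values are assumptions.
TITLE_WEIGHT = 3.0
BODY_WEIGHT = 1.0


def keyword_weights(title, body):
    """Weight each keyword by its frequency, scaled by where it appears."""
    weights = Counter()
    for word in title.lower().split():
        weights[word] += TITLE_WEIGHT
    for word in body.lower().split():
        weights[word] += BODY_WEIGHT
    return weights


def cosine_similarity(a, b):
    """Cosine similarity between two keyword-weight vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs, threshold=0.3):
    """Greedy single-pass grouping (a stand-in, not the talk's algorithm).

    Each document joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    vectors = [keyword_weights(title, body) for title, body in docs]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine_similarity(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = [
    ("database indexing", "btree index speeds database queries"),
    ("database queries", "query planner uses index statistics"),
    ("soccer results", "the team won the soccer match"),
]
print(cluster(docs))
```

In this toy input the two database documents share weighted keywords and end
up in one group, while the soccer document forms its own group.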