AJOU Central Library Repository: Topic Specific Web Crawler Using Degree of Relevance and Machine Learning

Topic Specific Web Crawler Using Degree of Relevance and Machine Learning

DC Field	Value	Language
dc.contributor.advisor	최경희	-
dc.contributor.author	서혜성	-
dc.date.accessioned	2019-10-21T06:46:26Z	-
dc.date.available	2019-10-21T06:46:26Z	-
dc.date.issued	2005	-
dc.identifier.other	446	-
dc.identifier.uri	https://dspace.ajou.ac.kr/handle/2018.oak/16465	-
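dc.description	Thesis (Master's)--Ajou University, Graduate School of Information and Communication Technology: Department of Information and Communication Engineering, 2005	-
dc.description.abstract	Providing Internet users with web pages that match their interests is indispensable. From this perspective, this thesis proposes a topic-specific Web crawler that computes the degree of relevance of each web page to a topic and refines the collected web pages using term frequency/document frequency, entropy, and compiled rules. Experiments evaluated the crawler's classification accuracy, crawling efficiency, and crawling consistency. In an experiment using 77 representative terms, the classification accuracy, that is, how well the collected results matched the given topic, averaged 90.2%.	-
dc.description.tableofcontents	List of Text: Chapter 1. Introduction = 1; Chapter 2. Related Work = 3; Chapter 3. Topic-Specific Web Crawler = 6; Section 1. Computing the Degree of Relevance = 6; Section 2. Web Page Classification Using Machine Learning = 9; Chapter 4. Experiments and Evaluation = 13; Section 1. Classification Accuracy = 13; Section 2. Crawling Efficiency = 16; Section 3. Crawling Consistency = 17; Chapter 5. Conclusion = 19; References = 20 \| List of Figures: Figure 1. Crawling algorithm of the topic-specific Web crawler = 7; Figure 2. Role of the ρ value in the degree-of-relevance formula = 8; Figure 3. Comparison of classification performance as the threshold μ varies, for four categories = 14; Figure 4. Comparison of average classification performance as the threshold μ varies, for subcategories = 15; Figure 5. Comparison of crawling efficiency as the ρ value varies, for four categories = 16; Figure 6. Comparison of crawling consistency for four categories; each crawler stopped after collecting 5,000 URLs = 18	-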
dc.language.iso	kor	-
dc.publisher	The Graduate School, Ajou University	-
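dc.rights	Ajou University theses are protected by copyright.	-
dc.title	Topic Specific Web Crawler Using Degree of Relevance and Machine Learning (original Korean title: 연관도 계산과 기계학습을 이용한 주제 기반 웹 수집기)	-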
dc.title.alternative	Topic Specific Web Crawler Using Degree of Relevance and Machine Learning	-
dc.type	Thesis	-
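dc.contributor.affiliation	Ajou University, Graduate School of Information and Communication Technology	-
dc.contributor.department	Graduate School of Information and Communication Technology, Department of Information and Communication Engineering	-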
dc.date.awarded	2005. 2	-
dc.description.degree	Master	-
dc.identifier.localId	T000000000446	-
dc.identifier.url	http://dcoll.ajou.ac.kr:9080/dcollection/jsp/common/DcLoOrgPer.jsp?sItemId=000000000446	-
dc.description.alternativeAbstract	It is essential that users browsing the Internet be given web pages classified into a given topic as accurately as possible. To this end, this thesis presents a topic-specific crawler that computes the degree of relevance and refines the preliminary set of related web pages using term frequency/document frequency, entropy, and compiled rules. In the experiments, we evaluate the crawler in terms of classification accuracy, crawling efficiency, and crawling consistency. With 77 representative terms, the resulting classification accuracy was 90.2% on average.	-
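
The abstract mentions refining collected pages with term frequency/document frequency and entropy, but this record does not give the thesis's formulas. The following is only a minimal sketch of one common way such statistics can be combined to pick representative terms; it is not the author's method. The function names (`term_entropy`, `select_terms`), the thresholds (`max_entropy`, `min_df`), and the sample data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def term_entropy(counts_by_category):
    """Shannon entropy (in bits) of a term's occurrence counts over categories.

    Low entropy means the term is concentrated in few categories.
    """
    total = sum(counts_by_category.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts_by_category.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def select_terms(docs_by_category, max_entropy, min_df):
    """Keep terms that appear in at least `min_df` documents and whose
    category distribution has entropy of at most `max_entropy`.

    `docs_by_category` maps a category name to a list of tokenized documents.
    Both thresholds are hypothetical tuning knobs, not values from the thesis.
    """
    term_counts = defaultdict(Counter)  # term -> {category: term frequency}
    doc_freq = Counter()                # term -> number of documents containing it
    for category, documents in docs_by_category.items():
        for tokens in documents:
            for term, tf in Counter(tokens).items():
                term_counts[term][category] += tf
                doc_freq[term] += 1
    return sorted(
        term
        for term, counts in term_counts.items()
        if doc_freq[term] >= min_df and term_entropy(counts) <= max_entropy
    )


# Illustrative, made-up data: two categories with two tiny documents each.
sample = {
    "sports": [["goal", "match", "the"], ["goal", "team", "the"]],
    "finance": [["stock", "market", "the"], ["stock", "bank", "the"]],
}
selected = select_terms(sample, max_entropy=0.5, min_df=2)
```

In this toy data, a word spread evenly across both categories has high entropy and is dropped, while words confined to one category and seen in at least two documents are kept.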

Appears in Collections: Special Graduate Schools > Graduate School of Information and Communication Technology > Department of Information and Communication > 3. Theses(Master)

Files in This Item: There are no files associated with this item.
