
Research Achievements

Chen Jing (陈静), Wang Tiantian (王甜甜), Lu Quan (陆泉): THC-DAT: a document analysis tool based on topic hierarchy and context information

2016-03-31

[Abstract]

Purpose: This paper proposes a novel within-document analysis tool, THC-DAT (topic hierarchy and context based document analysis tool), which enables users to interactively analyze any multi-topic document based on fine-grained, hierarchical topics automatically extracted from it. THC-DAT uses the hLDA (hierarchical Latent Dirichlet Allocation) method and takes context information into account so that it can reveal the relationships between latent topics and related texts in a document.

Design/methodology/approach: The methodology is a case study. The authors first reviewed the related literature, then applied a general "build and test" research model. After explaining the model, interface and functions of THC-DAT, they present a case study in which a scholarly paper is analyzed with the tool.

Findings: THC-DAT can organize and serve document topics and texts hierarchically and in context, which overcomes the drawbacks of traditional document analysis tools. The navigation, browse, search and comparison functions of THC-DAT enable users to read, search and analyze multi-topic documents efficiently and effectively.

Practical implications: THC-DAT can improve document organization and services in digital libraries or e-readers by helping users to interactively read, search and analyze documents efficiently and effectively, to learn about unfamiliar topics through exploration with little cognitive burden, or to deepen their understanding of a document.

Originality/value: This paper designs a tool, THC-DAT, that analyzes a document by topic hierarchy and context. It contributes to overcoming the coarse-grained analysis of existing within-document analysis tools.

 

[Keywords] analysis, hLDA, Context information, Multi-topic documents, Digital libraries, e-readers.

 

This paper was published in Library Hi Tech, Vol. 34 (2016).