[Objective] This paper investigates how users' search experience differs between two types of tasks, with and without a time constraint, with the goal of understanding how contextual factors influence search behavior and experience. [Methods] A user experiment was conducted in which 40 undergraduate students performed two types of search tasks, Fact Finding (FF) and Information Understanding (IU), under two time conditions: with a time constraint and without a time constraint. [Results] The results show that a time constraint significantly reduces users' search time, writing time, the amount of information produced in the notebook, and their new knowledge acquisition; in addition, users produce words in the documents at a faster rate. With respect to the task type effect, users' ratio of search time to writing time and the number of words produced in documents during search are not significantly influenced by task type, regardless of whether there is a time constraint. Interestingly, when there is no time constraint, users spend more time completing IU tasks and more time searching for them; however, there is no significant difference in the amount of information collected or in the level of new knowledge acquisition. When there is a time constraint, compared with FF tasks, users collect more information but think they acquire a lower level of new knowledge. [Limitations] This study was conducted in a lab environment, and the time constraint was manipulated by the experimenters, which may differ from real settings, so the results may not generalize to other conditions. [Conclusions] This study has implications for information search research in understanding how time and search task type influence search behaviors.
[Objective] User interest is not static but changes dynamically over time. This paper proposes a user interest prediction model based on a topic model and a multi-time function. [Methods] User interests are generated by a topic model, and the weight of each user interest at every time point is calculated by applying a multi-time function in order to predict user interest at the next time point. [Results] Compared with the memory-based user profile model and the multi-step user profile model, the cosine similarity and Kullback-Leibler divergence results on search engine log data provided by Sogou Lab show that this model predicts user interests more effectively. [Limitations] The proposed method is tested only on search engine log data provided by Sogou Lab, and it needs further examination on other data sets. [Conclusions] Taking every time point of a user's history data into consideration is more effective for user interest prediction.
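To make the evaluation in the abstract above concrete, the following is a minimal sketch of comparing a predicted topic distribution with an observed one using cosine similarity and Kullback-Leibler divergence. The abstract does not define the multi-time function, so `predict_next_interest` uses a simple exponential decay purely as a stand-in assumption; the function names, the `decay` parameter and the smoothing constant `eps` are all hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; eps avoids log(0) (assumption)."""
    return sum(a * math.log((a + eps) / (b + eps)) for a, b in zip(p, q) if a > 0)


def predict_next_interest(history, decay=0.5):
    """Weighted average of past topic distributions, oldest first.

    Assumption: an exponential decay stands in for the paper's
    multi-time function, which the abstract does not specify.
    """
    n = len(history)
    weights = [math.exp(-decay * (n - 1 - t)) for t in range(n)]
    total = sum(weights)
    n_topics = len(history[0])
    return [
        sum(w * dist[i] for w, dist in zip(weights, history)) / total
        for i in range(n_topics)
    ]


# Hypothetical topic distributions for one user at three time points.
history = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.2, 0.5, 0.3]]
predicted = predict_next_interest(history)
observed = [0.25, 0.45, 0.30]
similarity = cosine_similarity(predicted, observed)
divergence = kl_divergence(observed, predicted)
```

Higher cosine similarity and lower divergence indicate a closer match between the predicted and observed interests.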
[Objective] Address the problem that review depth is measured only by the length of an online product review. [Methods] A metrics-model for online product review depth is proposed. First, based on an analysis of the information customers need for decision making, the concept of review depth is defined and the feature concept tree of a product is introduced. Second, the metrics-model for measuring product review depth is presented according to the features of product reviews obtained from domain experts and the distribution of product features over the feature concept tree. [Results] An empirical study demonstrates that the metrics-model is consistent with the model for review helpfulness, and the result shows that the model is feasible. [Limitations] This paper does not cover consumers' product usage scenarios or the measurement of review depth for experience products. [Conclusions] The metrics-model can measure product review depth more accurately.
[Objective] This paper addresses the problem of product feature extraction, especially noun phrase identification. [Methods] Chinese chunk parsing is used to extract candidate features, and frequent itemsets are generated by Apriori. The candidate product features are then filtered according to rules on minimum support, frequent nouns and TF-IDF to obtain the final product feature sets. [Results] To verify the effectiveness of the method, car reviews are used; the average recall reaches 76.89% and the average precision reaches 84.03%. [Limitations] The recall is low, and noun phrase identification errors occur in the test. [Conclusions] Experimental results show that the method can extract product features from Chinese reviews effectively.
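The frequent-itemset step mentioned in the abstract above can be sketched as a small level-wise Apriori routine with a minimum-support filter. This is an illustration under assumptions, not the authors' implementation: the toy reviews, the `min_support` value and the function name `apriori` are hypothetical, and chunk parsing and TF-IDF filtering are left out.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every distinct item as a one-element itemset.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent.
        current = {
            c
            for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical candidate nouns extracted from four car reviews.
reviews = [
    {"engine", "fuel"},
    {"engine", "seat"},
    {"engine", "fuel", "seat"},
    {"fuel"},
]
frequent_sets = apriori(reviews, min_support=0.5)
```

Itemsets that fall below the support threshold, such as a noun pair that co-occurs in only one review, are dropped before any further filtering.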
[Objective] Build an auto-indexing system for Chinese scientific and technical literature, using triple acquisition and NLP, based on an Ontology management and service platform. [Methods] By merging Ontology knowledge bases and vocabularies through Web services, the system identifies terms and unlisted words through vocabulary matching and word combination, and links them with the triples in the knowledge bases to build a conceptual relational network. [Results] The system can process 86 articles per second with a recall of 65% and a precision of 69%. [Limitations] Matching terms takes a lot of time because no index is built. The performance of Chinese word segmentation and POS tagging is affected by noisy data such as spaces and line breaks. [Conclusions] The data cleaning process and the optimization of the keyword selection algorithm need further study to support deep mining and improve the efficiency of the system.
[Objective] In traditional collaborative filtering algorithms, issues such as data sparsity can degrade recommendation quality. This paper attempts to solve this problem by optimizing the recommendation mechanism. [Methods] This paper uses cohesive subgroup analysis techniques to identify indirect trust relationships in trust networks and combines them with direct trust relationships to generate an integrated trust, which is used to calculate user similarity in a new collaborative filtering recommendation algorithm. [Results] Experimental results show that an integrated trust combining 35% direct and 65% indirect relationships can improve the accuracy of CF algorithms, and that, compared with using only direct trust relationships, indirect trust relationships cannot be ignored. [Limitations] When considering indirect trust in the trust network, this paper ignores the impact of additional intermediate nodes between two users. [Conclusions] Soft integration of indirect trust relationships can improve the recommendation accuracy of collaborative filtering algorithms.
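The 35%/65% combination reported in the abstract above can be illustrated with a short sketch. Only the weighting comes from the abstract; how indirect trust is derived from cohesive subgroups is not described there, so the indirect values below are made-up inputs, and the mean-centred prediction formula is a common collaborative filtering form used here as an assumption. All names are hypothetical.

```python
def integrated_trust(direct, indirect, alpha=0.35):
    """Blend direct and indirect trust; alpha=0.35 follows the abstract."""
    return alpha * direct + (1 - alpha) * indirect


def predict_rating(user, item, ratings, trust):
    """Trust-weighted, mean-centred rating prediction (assumed formula).

    ratings: {user: {item: rating}}; trust: {(user, neighbour): weight}.
    """
    user_mean = sum(ratings[user].values()) / len(ratings[user])
    numerator = 0.0
    denominator = 0.0
    for (source, neighbour), weight in trust.items():
        if source != user or item not in ratings.get(neighbour, {}):
            continue
        neighbour_mean = sum(ratings[neighbour].values()) / len(ratings[neighbour])
        numerator += weight * (ratings[neighbour][item] - neighbour_mean)
        denominator += abs(weight)
    if denominator == 0:
        return user_mean  # no trusted neighbour rated the item
    return user_mean + numerator / denominator


# Hypothetical ratings and trust values for three users.
ratings = {
    "u1": {"a": 4.0, "b": 2.0},
    "u2": {"a": 5.0, "b": 3.0, "c": 4.0},
    "u3": {"a": 1.0, "c": 5.0},
}
trust = {
    ("u1", "u2"): integrated_trust(direct=1.0, indirect=0.4),
    ("u1", "u3"): integrated_trust(direct=0.0, indirect=0.6),
}
estimate = predict_rating("u1", "c", ratings, trust)
```

Note that `u3` influences the estimate even though `u1` has no direct trust in `u3`, which is the effect the abstract attributes to indirect trust.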
[Objective] Solve two problems: traditional classification divides samples rigidly, and some classification methods handle only discrete data. [Methods] A fuzzy comprehensive evaluation method is put forward to realize fuzzy classification for samples with continuous attributes, yielding a soft classification of samples into categories. In the process, continuous attribute discretization is used to divide the attribute intervals, and the particle swarm optimization algorithm is used to obtain the optimal weight distribution. The final results are the membership degrees of samples to each category. [Results] This method can effectively achieve a soft division of samples. [Limitations] This method has difficulty dividing attributes whose values are too concentrated. [Conclusions] This fuzzy classification method based on particle swarm optimization and fuzzy comprehensive evaluation is effective and feasible.
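The final step described in the abstract above, turning attribute weights and per-attribute memberships into category membership degrees, can be sketched with the weighted-average composition that is commonly used in fuzzy comprehensive evaluation. The abstract does not say which composition operator is used, so that choice is an assumption, as are the fixed example weights, which stand in for the output of particle swarm optimization.

```python
def fuzzy_evaluate(weights, membership):
    """Compose attribute weights with a membership matrix.

    weights: one weight per attribute.
    membership: membership[i][j] is the degree to which attribute i
    supports category j for a given sample.
    Returns normalized membership degrees of the sample to each category,
    using the weighted-average operator (assumption).
    """
    n_categories = len(membership[0])
    scores = [
        sum(w * row[j] for w, row in zip(weights, membership))
        for j in range(n_categories)
    ]
    total = sum(scores)
    return [s / total for s in scores] if total > 0 else scores


# Hypothetical sample with two attributes and two categories.
weights = [0.6, 0.4]  # stand-in for PSO-optimized weights
membership = [
    [0.8, 0.2],  # attribute 1: degrees for category 1 and category 2
    [0.5, 0.5],  # attribute 2
]
degrees = fuzzy_evaluate(weights, membership)
```

The sample is not forced into a single category; the returned degrees express the soft division across all categories.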
[Objective] To track topic sources and trends for a high-impact paper from its citation network. [Methods] First, topics are detected for each paper using a domain Ontology. Second, a citation network oriented toward a single paper's topic is constructed. The nodes of the network are selected from second-level cited papers, cited papers, citing papers and second-level citing papers according to their contents. Third, incremental clustering is applied to mine topic sources and trends from the constructed network, noisy sources and trends are filtered out, and the evolution paths of topics are formed. [Results] The structural changes and content changes of topic sources and trends are fully revealed. [Limitations] The screening conditions for constructing the citation network need further study. In addition, the completeness of the domain Ontology is not considered. [Conclusions] This study tracks topic sources and trends for a single paper effectively and helps reveal the origin and development of its topics.
[Objective] Provide an alternative perspective for identifying influential authors. [Methods] This paper uses the weighted LeaderRank algorithm to measure authors' impact in a coauthorship network. The effects of citations and of the number of collaborations on ranking influential authors are validated separately through differently weighted algorithms. Based on this validation, a new weighted algorithm named CW_LR is proposed by integrating the two factors. [Results] The CW_LR algorithm is correlated with citations, but compared with citations or other weighted algorithms, its results are more consistent with expert knowledge. [Limitations] The algorithm is tested in the informetrics research community; further validation of its effectiveness in other research communities is required. [Conclusions] The CW_LR algorithm considers the strength of cooperation and citation impact at the same time, and identifies influential authors more accurately from these two dimensions.
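As an illustration of the approach in the abstract above, here is a minimal sketch of a weighted LeaderRank on an undirected coauthorship network. It follows the usual LeaderRank construction: a ground node linked to every author, unit initial scores, and the ground node's final score shared equally among authors. The edge weights, the `ground_weight` parameter and the function name are assumptions; the abstract does not give the exact CW_LR weighting, so this is not the authors' algorithm.

```python
def weighted_leaderrank(edges, ground_weight=1.0, tol=1e-10, max_iter=1000):
    """Rank authors in an undirected weighted coauthorship network.

    edges: {(author_u, author_v): weight}. Weights are hypothetical here,
    for example a blend of collaboration counts and citations.
    """
    nodes = sorted({n for edge in edges for n in edge})
    ground = object()  # extra node linked to every author
    adj = {n: {} for n in nodes}
    adj[ground] = {}
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    for n in nodes:
        adj[n][ground] = ground_weight
        adj[ground][n] = ground_weight

    score = {n: 1.0 for n in nodes}
    score[ground] = 0.0
    for _ in range(max_iter):
        new = {n: 0.0 for n in adj}
        for j, neighbours in adj.items():
            strength = sum(neighbours.values())
            # Node j passes its score to neighbours in proportion to weight.
            for i, w in neighbours.items():
                new[i] += score[j] * w / strength
        delta = max(abs(new[n] - score[n]) for n in adj)
        score = new
        if delta < tol:
            break
    # Share the ground node's score equally among all authors.
    share = score[ground] / len(nodes)
    return {n: score[n] + share for n in nodes}


# Hypothetical coauthorship network with four authors.
edges = {("A", "B"): 3.0, ("A", "C"): 1.0, ("B", "C"): 1.0, ("C", "D"): 1.0}
ranking = sorted(weighted_leaderrank(edges).items(), key=lambda kv: -kv[1])
```

Authors with stronger weighted ties accumulate more score, so changing how the edge weight is built changes the resulting ranking.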
[Objective] Discover, analyze and evaluate the research teams of an organization or subject through the study of the weighted co-author network. [Methods] Build a comprehensive weighted model of the co-author network using factors such as co-authorship frequency, co-author number, author ranking and citation frequency. Conduct empirical research on the discovery and evaluation of research teams using social network analysis. [Results] The virtual teams and evaluation results obtained with this method are consistent with research results on the actual teams. The method can synthesize multiple influence factors and objectively evaluate the structure and influence of a research team. [Limitations] To ensure that details of the actual teams could be obtained to verify the results, the authors chose their own institution as the evaluation object, which narrows the scope of the empirical research. The data used are of a single type. [Conclusions] The method is suitable, within a certain scope, for discovering research teams, analyzing their relationships and evaluating their construction. The results help readers quickly understand a team and provide information for team optimization.
[Objective] Design and implement the STKOS term publishing and sharing service platform. [Context] As a metathesaurus, STKOS needs to be published to public and organizational users in order to promote knowledge services and sharing. [Methods] Based on a study of international term service projects and systems and an analysis of the features and requirements of STKOS, the framework of the system is designed and implemented. The key issues are discussed, including application scenarios, data exchange formats, data structures, visualization and multi-version management. [Results] The STKOS term publishing and sharing service platform is developed for large-scale data scenarios. [Conclusions] The system supports STKOS data management and release, the presentation of STKOS contents, and browsing, retrieval and customized download for users.
[Objective] Improve the construction efficiency and commonality of Internet TV multimedia resources in libraries, and improve user experience. [Context] Internet TV is one of the important new media services in libraries. Designing a suitable multimedia document structure for libraries helps increase the efficiency of resource production and publication and improves the commonality of resources. [Methods] According to the requirements for producing, publishing and exhibiting multimedia content, a simple XML-based multimedia document structure named ZDS is designed, and program tools are developed to produce ZDS documents automatically. [Results] ZDS documents can organize and package library multimedia materials in an orderly way, and can be correctly parsed and displayed on Internet TV. [Conclusions] This approach helps standardize the construction procedures for library Internet TV resources, promotes resource exchange and sharing, and improves work efficiency.
[Objective] Extend the mobile information service channels of the library, improve response speed and enhance user experience. [Context] The system runs on different mobile terminal platforms and realizes real-time asynchronous transmission of library information. [Methods] Using WebSocket, a process is designed in which the client sends a query, the server parses the query instruction, the library information query module retrieves the information, and the reply is returned to the client. [Results] Readers can click the client menu to conveniently access library information services. [Conclusions] The system transmits data efficiently, operates across platforms, and is conducive to the expansion of library information services.
[Objective] This paper aims to solve the problems of automatic deployment in the data center of a university library as the number of servers increases. [Context] With the rapid growth in servers and development projects, the data center of the university library cannot cope with the heavy workload. [Methods] Introducing the concept of automatic control, this paper uses automated scripts to maintain the infrastructure configuration of the data center, and uses Vagrant to guarantee consistency between the development environment and the production environment. [Results] This paper solves two problems: the automatic deployment of the components that servers and virtual machine operating systems depend on, and the inconsistency between the development environment and the production environment. As a result, staff workload is reduced and development efficiency is improved. [Conclusions] By applying automatic control strategies, the library's data center achieves the goals of clarity, standardization and automation.