Current Issue
    Volume 2 Issue 12
    Analyzing Interdisciplinarity and Scientists’ Academic Impacts
    Li Dong, Tong Shouchuan, Li Jiang
    2018, 2 (12): 1-11.  DOI: 10.11925/infotech.2096-3467.2018.0452

    [Objective] This paper explores the relationship between scientists’ interdisciplinary knowledge and their academic impact. [Methods] First, we collected 200 candidates from the 2016 National Natural Science Foundation Outstanding Youth Program and their articles indexed by the Web of Science. Second, we retrieved interdisciplinary co-authorship and citation data. Third, we used Brillouin’s index as a measure of interdisciplinarity and the h-index as a measure of academic influence. Finally, we calculated the correlation coefficients between interdisciplinarity and academic influence. [Results] We found no significant correlation between interdisciplinary collaboration and academic influence except in biology, and no significant correlation between interdisciplinary citations and academic influence except in medicine and biology. [Limitations] Deciding a scientist’s discipline based on his/her affiliation might introduce bias. [Conclusions] A scientist’s interdisciplinary collaborations and citations are not necessarily correlated with his/her academic influence.
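
The two measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the textbook form of Brillouin's index with natural logarithms, H = (ln N! - sum of ln n_i!) / N, where n_i is the number of a scientist's items (for example co-authors or cited papers) falling in discipline i. The function names and the sample counts are hypothetical.

```python
import math

def brillouin_index(counts):
    """Brillouin's index H = (ln N! - sum(ln n_i!)) / N for category counts n_i."""
    counts = [c for c in counts if c > 0]
    n_total = sum(counts)
    if n_total == 0:
        return 0.0
    # math.lgamma(n + 1) equals ln(n!), which avoids overflow for large counts.
    log_fact_total = math.lgamma(n_total + 1)
    log_fact_parts = sum(math.lgamma(c + 1) for c in counts)
    return (log_fact_total - log_fact_parts) / n_total

def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical scientist: co-authors spread over three disciplines,
# and citation counts for five papers.
diversity = brillouin_index([6, 3, 1])
impact = h_index([10, 8, 5, 4, 3])
```

A scientist whose items all fall in one discipline gets an index of 0, and the value grows as the items spread more evenly over more disciplines.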

    Identifying Useful Information from Open Innovation Community
    Li He, Zhu Linlin, Yan Min, Liu Jincheng, Hong Chuang
    2018, 2 (12): 12-22.  DOI: 10.11925/infotech.2096-3467.2018.0393

    [Objective] This paper aims to identify useful messages in open innovation communities, which contain large amounts of redundant and low-quality information. [Methods] First, based on the information adoption model, we retrieved 23,137 user comments on programming bugs from the official Xiaomi MIUI Forum. Then, we applied binary logistic regression to explore the factors affecting the usefulness of these comments. [Results] The timeliness of information had a positive impact on its usefulness, the integrity of information also affected its usefulness, and the semantics of information had a negative effect on its usefulness. Users’ previous experience did not influence the usefulness of information. However, users’ previous contributions had a positive effect on the usefulness of information. [Limitations] The research data was collected from a small portion of one community, which might yield biased results. [Conclusions] This paper could help effectively identify useful information in open innovation communities.
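
To make the method concrete, here is a minimal sketch of binary logistic regression fitted by batch gradient descent. It is an illustration under stated assumptions, not the authors' analysis: the single feature (a timeliness score), the toy data, the learning rate and the epoch count are all hypothetical. The sign of each fitted weight is what corresponds to a "positive" or "negative" effect on usefulness in the abstract.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n_features = len(rows[0])
    n = len(rows)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # derivative of the log-loss w.r.t. the linear score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical data: one feature (timeliness score), label 1 = comment adopted as useful.
rows = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 1, 1]
weights, bias = fit_logistic(rows, labels)
```

In this toy data a higher score goes with the useful label, so the fitted weight comes out positive. A real analysis would also report significance tests for each coefficient, which this sketch omits.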

    Comprehending Texts and Answering Questions Based on Hierarchical Interactive Network
    Cheng Yong, Xu Dekuan, Lv Xueqiang
    2018, 2 (12): 23-32.  DOI: 10.11925/infotech.2096-3467.2018.0583

    [Objective] This paper aims to help computers answer questions accurately based on text comprehension. [Methods] First, we proposed a neural network model based on a hierarchical interaction mechanism. We introduced several human thinking mechanisms to build this model, including hierarchical processing, content filtering and multi-dimensional attention. Then, we ran the proposed model on the dataset from Chinese Machine Reading Comprehension (CMRC) 2017. [Results] The precision of the proposed method on the test set was 0.78, which was better than the best result of other published models. [Limitations] The potential answers were not further optimized. [Conclusions] The proposed hierarchical interactive network improves machines’ ability to answer questions based on text comprehension.

    Predicting Stock Prices with Text and Price Combined Model
    Yu Chuanming, Gong Yutian, Wang Feng, An Lu
    2018, 2 (12): 33-42.  DOI: 10.11925/infotech.2096-3467.2018.0420

    [Objective] This paper tries to predict stock price fluctuations with the help of big data, aiming to improve forecasting accuracy and reduce trading risks. [Methods] We proposed a new Text and Price Combined Model (TPCM) to process comments retrieved from a stock forum. We employed a deep representation learning algorithm to generate the text feature matrix and used the K-means clustering method to generate text categories. Finally, we used a Multi-Layer Perceptron (MLP) to predict stock price fluctuations based on the opening price, closing price and 15 other original price indicators. [Results] The accuracy of TPCM was 65.91%, which was 7.76 percentage points higher than that of the model employing price features only (58.15%), and 11.37 percentage points higher than that of the model employing text features only (54.54%). [Limitations] The study used only one stock to examine the proposed model. [Conclusions] Stock price forecasting could be improved by combining text and price features, which opens new perspectives for future studies.
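
The combination step described in this abstract can be sketched roughly as follows: cluster the text vectors with K-means, then append the resulting text category to the price indicators to form one input row for a classifier such as an MLP. This is a sketch under assumptions, not the TPCM implementation: the two-dimensional text vectors, the fixed initial centroids, the one-hot encoding of the category and the function names are all hypothetical, and the MLP itself is left out.

```python
def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean distance) for each point."""
    return [
        min(range(len(centroids)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
        for p in points
    ]

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd iterations starting from the given initial centroids."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(list(old))  # keep the centroid of an empty cluster
        centroids = new_centroids
    return assign(points, centroids), centroids

def combine(price_features, text_label, n_clusters):
    """Concatenate price indicators with a one-hot encoding of the text category."""
    one_hot = [1.0 if k == text_label else 0.0 for k in range(n_clusters)]
    return list(price_features) + one_hot

# Hypothetical 2-D text vectors for four trading days.
text_vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centroids = kmeans(text_vectors, [[0.0, 0.0], [10.0, 10.0]])
# Hypothetical price indicators (opening price, closing price) for the last day.
row = combine([12.3, 12.8], labels[3], n_clusters=2)
```

Each combined row would then be paired with the observed price movement and passed to the classifier.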

    Building Interactive Knowledge Map for Academic Search
    Liu Ping, Li Yanan, Yu Cong
    2018, 2 (12): 43-51.  DOI: 10.11925/infotech.2096-3467.2018.0419

    [Objective] This paper presents an approach to constructing an interactive knowledge map that facilitates browsing and keyword searching. [Methods] First, we modeled academic resources to reveal the implicit knowledge nodes and their complex relationships. Then, we built the interactive knowledge map based on user queries, which suggested associated terms and presented results in a lattice. [Results] We examined the proposed method with documents from the Proceedings of the International ACM SIGIR Conference over the past 10 years. We discovered a hidden knowledge structure that helps users locate core concepts and improve searching. [Limitations] The recommendation of relevant concepts needs to be improved. [Conclusions] The proposed interactive knowledge map helps users effectively explore the information space.

    Identifying Trending Topics in Q&A Community with CART Decision Tree
    Cheng Xiufeng, Zhang Xinyi, Wang Ning
    2018, 2 (12): 52-59.  DOI: 10.11925/infotech.2096-3467.2018.0415

    [Objective] This paper tries to identify trending topics, aiming to help decision-making agencies manage online public opinion. [Methods] First, we proposed criteria for detecting trending topics in a Q&A community. Then, we conducted an empirical study on China’s Zhihu Q&A community using the CART decision tree algorithm. [Results] The CART decision tree predicted the trending topics. [Limitations] We only collected data from a small portion of all topics on Zhihu. More data is needed for future studies. [Conclusions] The proposed method based on the CART decision tree algorithm could effectively predict trending topics in the Q&A community, which helps us choose popular content.
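
For readers unfamiliar with CART, the core of the algorithm is choosing, at each node, the binary split that minimizes the weighted Gini impurity of the two children. The sketch below shows that single step for one numeric feature. The feature (answers per topic) and the toy labels are hypothetical, and a full CART tree would apply this search recursively over all features.

```python
def gini(labels):
    """Gini impurity 1 - sum(p_c^2) over the classes present in labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Threshold on one numeric feature that minimizes the weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical data: answers per topic, label 1 = topic became trending.
answer_counts = [1, 2, 8, 9]
trending = [0, 0, 1, 1]
threshold, impurity = best_split(answer_counts, trending)
```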

    Identifying Weibo Opinion Leaders with Social Network Analysis and Influence Diffusion Model
    Chen Fen, Fu Xi, He Yuan, Xue Chunxiang
    2018, 2 (12): 60-67.  DOI: 10.11925/infotech.2096-3467.2018.0200

    [Objective] This paper tries to identify Weibo opinion leaders with the help of social network analysis and an influence diffusion model. [Methods] First, we analyzed the opinion leaders’ characteristics based on social network analysis. Then, we optimized the existing influence diffusion model from the perspectives of impact scope and extent. Finally, we applied the new model to find opinion leaders. [Results] Compared with the models built on centrality analysis or semantic similarity, the optimized model obtained a better ranking of opinion leaders, which was consistent with the Weibo data. [Limitations] We only examined the proposed method with data on GMO foods. [Conclusions] The proposed model could effectively identify Weibo opinion leaders.

    Classifying Chinese Texts with CapsNet
    Feng Guoming, Zhang Xiaodong, Liu Suhui
    2018, 2 (12): 68-76.  DOI: 10.11925/infotech.2096-3467.2018.0391

    [Objective] This study tries to address the issues facing long text representation and uses CapsNet to improve the accuracy of Chinese text classification. [Methods] First, we proposed an LDA matrix and word vectors to represent long texts. Then, we constructed a Chinese text classification model based on CapsNet. Third, we examined the proposed model with the Sogou news corpus and the text classification corpus of Fudan University. Finally, we compared our results with those of classic models (e.g., TextCNN and DNN). [Results] The CapsNet model performed better than the other models. The classification accuracy in five categories of short and long texts reached 89.6% and 96.9%, respectively. The proposed model converged almost twice as fast as the CNN model. [Limitations] The computational complexity of the model is high, which limits the size of the testing corpus. [Conclusions] The proposed Chinese text representation method and the modified CapsNet model have better accuracy, convergence speed and robustness than the existing ones.
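
One building block that distinguishes capsule networks from ordinary CNNs is the "squash" nonlinearity, which rescales a capsule's output vector so that its length lies between 0 and 1 while its direction is preserved. The sketch below shows the standard form from the original capsule network formulation. It says nothing about the modifications made in this paper, which the abstract does not describe.

```python
import math

def squash(vector):
    """Standard capsule squash: v = (|s|^2 / (1 + |s|^2)) * (s / |s|)."""
    norm_sq = sum(x * x for x in vector)
    if norm_sq == 0.0:
        return [0.0 for _ in vector]  # a zero vector stays zero
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / norm
    return [scale * x for x in vector]

# A long input vector is squashed to a length just below 1.
output = squash([3.0, 4.0])
output_length = math.hypot(*output)
```

The output length can be read as the probability that the entity represented by the capsule is present, which is why short inputs shrink toward 0 and long ones saturate near 1.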

    Clustering Social Tags with Improved DBSCAN Algorithm
    Xiong Huixiang, Ye Jiaxin, Jiang Wuxuan
    2018, 2 (12): 77-88.  DOI: 10.11925/infotech.2096-3467.2018.0358

    [Objective] This paper tries to improve the DBSCAN algorithm and verify its feasibility and effectiveness in social tagging. [Methods] First, we analyzed the frequency of social tags for resources and their total appearances. Then, we examined the relationship between tags and resources to improve the DBSCAN clustering algorithm. Finally, we applied the new algorithm to cluster tags and users. [Results] We ran our experiment with data from Douban Movies. The modified DBSCAN algorithm improved the inter-object and inter-cluster correlations of social tags. [Limitations] The sample datasets need more in-depth mining. [Conclusions] The improved DBSCAN algorithm could effectively cluster social tags.
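
As background for this abstract, the baseline DBSCAN algorithm can be sketched as below. This is the standard density-based procedure with Euclidean distance, not the improved variant the paper proposes, whose details the abstract does not give. The points, `eps` and `min_pts` values are hypothetical.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks points in no cluster."""
    labels = [None] * len(points)  # None means not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels

# Hypothetical 2-D embeddings of tags: two dense groups and one outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Unlike K-means, the number of clusters is not fixed in advance, and isolated points are reported as noise instead of being forced into a cluster.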

    Converting STKOS Metathesaurus to RDF Triples with R2RML
    Wang Ying, Wu Sizhu
    2018, 2 (12): 89-97.  DOI: 10.11925/infotech.2096-3467.2018.0423

    [Objective] This paper aims to convert the STKOS Metathesaurus from relational database records to RDF triples. [Methods] First, we defined the semantic schema of the STKOS based on its storage features and data characteristics. Then, we mapped the scientific terms, standard concepts, categories, as well as source concepts and terms with the help of R2RML. Finally, we converted the documents stored in the relational database to RDF datasets with the R2RML parser. [Results] The proposed method could process the STKOS Metathesaurus automatically and generated 190 million RDF triples. All new records were stored in the Virtuoso database and could be queried with SPARQL. [Limitations] Predicates in R2RML lack flexibility; therefore, more complex datasets need to be split and transformed first. [Conclusions] The proposed model sheds light on future research on converting other relational database records or thesauri to RDF datasets.
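
The idea behind an R2RML mapping can be illustrated in a few lines: a subject template turns each row's key into an IRI, and each mapped column becomes a predicate with the column value as object. The sketch below mimics that logic directly in Python and is not an R2RML processor. The `example.org` IRIs, the column names and the sample row are hypothetical and do not come from the STKOS schema.

```python
def row_to_triples(row, subject_template, predicate_map):
    """Turn one relational row into N-Triples lines, in the spirit of an R2RML triples map."""
    subject = "<" + subject_template.format(**row) + ">"
    triples = []
    for column, predicate in predicate_map.items():
        value = row.get(column)
        if value is None:
            continue  # as in R2RML, a NULL column value produces no triple
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} <{predicate}> "{escaped}" .')
    return triples

# Hypothetical row and mapping.
row = {"id": 7, "label": "gene", "note": None}
predicate_map = {
    "label": "http://www.w3.org/2004/02/skos/core#prefLabel",
    "note": "http://www.w3.org/2004/02/skos/core#note",
}
triples = row_to_triples(row, "http://example.org/concept/{id}", predicate_map)
```

In actual R2RML the same information is written declaratively in Turtle (`rr:subjectMap` with `rr:template`, `rr:predicateObjectMap` with `rr:column`), and a processor applies it to every row of the table.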

    Predicting Antineoplastic Drug Targets Based on Network Properties
    Fan Xinyue, Cui Lei
    2018, 2 (12): 98-108.  DOI: 10.11925/infotech.2096-3467.2018.0545

    [Objective] This paper tries to identify potential targets of antineoplastic drugs, aiming to provide references for future clinical work and experiments. [Methods] First, we retrieved the targets of antineoplastic drugs from the DrugBank database and combined them with protein interaction information from the HPRD database. Then, we established the PPI network for these targets with Cytoscape and calculated the topological properties of the nodes. Third, we used single-factor analysis in SPSS and the information gain criterion in Weka to select the topological attribute variables. Fourth, we introduced the SMOTE algorithm to process the unbalanced datasets and constructed the prediction model for antineoplastic drug targets with the decision tree method. Finally, we compared the performance of our new model with that of classic ones. [Results] The precision of the proposed model reached 73.18%. With the help of cBioPortal, we found 16 targets with prediction scores higher than 0.9. These targets could mutate and amplify in various tumors, which we analyzed using the case of NR5A1. [Limitations] The characteristics of target functions, sequence attributes, and other factors should also be included in constructing the model. [Conclusions] The proposed model could predict the potential targets of antineoplastic drugs effectively.
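
The SMOTE step mentioned in this abstract balances the classes by creating synthetic minority samples: each new sample is a random point on the line segment between a minority sample and one of its nearest minority neighbors. The sketch below shows that interpolation only. The two-dimensional feature vectors, the neighbor count `k` and the seed are hypothetical, and it assumes at least two distinct minority samples.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbors of the chosen sample, excluding itself.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic

# Hypothetical topological features (e.g. scaled degree and betweenness) of known targets.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=5)
```

The synthetic samples are added to the training set only, so that the decision tree sees a balanced class distribution without duplicating existing records.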

  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn