[Objective] This paper reviews the technical solutions for detecting online extremism and radicalization. [Methods] First, we retrieved the relevant literature through keyword searches of several popular academic databases. Then, we reviewed these papers and summarized their theoretical frameworks, data sources, labelling methods, and algorithms. [Results] Researchers have obtained insights from the latest psychology and sociology studies, which helped them refine the detection indicators and methods. The two popular techniques in this field were lexicon-based methods and machine learning algorithms. Although machine learning methods offered better accuracy and faster speed, constructing the training data sets remains very difficult. [Limitations] We did not compare the effectiveness of different solutions. [Conclusions] The reviewed techniques are still developing, and more quantitative research is required to analyze the radicalization process. We need to cooperate with sociology and psychology researchers to develop new models and better training data sets.
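The lexicon-based technique mentioned in the results can be illustrated with a minimal sketch: score a post by the weights of lexicon terms it contains and flag it when the score passes a threshold. The terms, weights, and threshold below are placeholders, not drawn from any reviewed study.

```python
# Toy sketch of lexicon-based scoring; lexicon entries and threshold are hypothetical.
RADICAL_LEXICON = {"keyword_a": 2.0, "keyword_b": 1.5, "keyword_c": 1.0}

def lexicon_score(text: str) -> float:
    """Sum the lexicon weights of terms that appear in the post."""
    tokens = text.lower().split()
    return sum(RADICAL_LEXICON.get(tok, 0.0) for tok in tokens)

def flag_post(text: str, threshold: float = 2.0) -> bool:
    """Flag a post for review when its lexicon score reaches the threshold."""
    return lexicon_score(text) >= threshold
```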
[Objective] This study modifies the Naive Bayes classifier according to the features of counterterrorism intelligence, aiming to provide a simple and practical way to categorize these data. [Methods] Firstly, we deleted the outliers of terrorism-related data, discretized continuous attributes, and reduced highly correlated attributes. Secondly, we computed the conditional probabilities of the different attributes. Lastly, we classified new samples based on the maximum a posteriori hypothesis. [Results] After categorizing the data, we raised the probability threshold to partially offset the influence of attribute dependence. Only some data of high sensitivity needs to be processed manually. [Limitations] This method imposes restrictions on data independence. In practice, it must be combined with other classification methods, such as decision trees, to cover more intelligence data and provide information for early warning. [Conclusions] The proposed method, which increases the efficiency of intelligence analysis, is easy to use and places fewer restrictions on intelligence analysts.
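A hedged sketch of this workflow with scikit-learn: discretize continuous attributes, fit a Naive Bayes model, classify by the maximum a posteriori class, and route low-confidence samples to manual review. The synthetic data and the 0.8 threshold are illustrative assumptions, not values from the study.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.naive_bayes import CategoricalNB

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))          # stand-in intelligence attributes
y = rng.integers(0, 2, size=200)       # stand-in category labels

# Discretize continuous attributes into ordinal bins.
disc = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="quantile")
X_disc = disc.fit_transform(X).astype(int)

# Conditional probabilities are estimated per attribute; prediction follows
# the maximum a posteriori hypothesis.
clf = CategoricalNB().fit(X_disc, y)
proba = clf.predict_proba(X_disc)

# Raise the probability threshold to partially offset attribute dependence;
# anything below it is flagged for manual analysis.
THRESHOLD = 0.8
auto_labels = proba.argmax(axis=1)
needs_manual_review = proba.max(axis=1) < THRESHOLD
```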
[Objective] This paper tries to predict the locations of suspects based on historical activity trajectory data, aiming to locate, track, monitor or arrest the suspects. [Methods] First, we proposed long short-term memory (LSTM) and convolutional neural network (CNN) models to predict crime locations. Then, we used the CNN model to extract the location features of key suspects and analyze their spatial correlations. Finally, we utilized the LSTM model to maintain temporal continuity and obtain future locations. [Results] Compared with previous models, the proposed method increased the prediction accuracy from 0.71 to 0.79 on the GeoLife trajectory dataset. [Limitations] The model was only examined with the GeoLife dataset. [Conclusions] The proposed method fully exploits the spatial correlation and temporal continuity of the data, which improves the effectiveness of public security intelligence analysis.
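A minimal PyTorch sketch in the spirit of this method: a 1-D convolution extracts local spatial features from a trajectory of (latitude, longitude) points, and an LSTM models temporal continuity before a linear head predicts the next location. Layer sizes and the random trajectory are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CNNLSTMPredictor(nn.Module):
    def __init__(self, hidden_size: int = 64):
        super().__init__()
        self.conv = nn.Conv1d(in_channels=2, out_channels=32, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 2)     # predict (lat, lon)

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        # traj: (batch, seq_len, 2)
        x = self.conv(traj.transpose(1, 2))       # spatial features: (batch, 32, seq_len)
        x = torch.relu(x).transpose(1, 2)         # back to (batch, seq_len, 32)
        out, _ = self.lstm(x)                     # temporal continuity
        return self.head(out[:, -1])              # next-location estimate

model = CNNLSTMPredictor()
dummy_traj = torch.randn(8, 20, 2)                # 8 trajectories of 20 points each
next_location = model(dummy_traj)                 # shape (8, 2)
```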
[Objective] This paper tries to assess the terrorism risks facing the civil aviation industry quantitatively and objectively. [Methods] We proposed a risk assessment model based on K-means clustering, and then examined it with data on terrorist attacks from 1992 to 2015. We calculated the risks of different types of attacks and their targets objectively. [Results] The risks of aircraft bombing and armed assault against airport and airline staff were the highest, the risks of hijacking and bombing/explosion against airport or airline staff were at a medium level, and the risks of other attacks were relatively low. We used this method to predict the risk of terrorist attacks against civil aviation in 2016, and the prediction accuracy was up to 92.3%. [Limitations] The proposed method for risk assessment is only suitable for processing numerical data. [Conclusions] The K-means clustering method can assess risk based on statistical data without human intervention, which could be applied to similar studies.
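A hedged sketch of the K-means risk grading step: cluster per-attack-type statistics into three groups and rank the clusters by centroid magnitude as high/medium/low risk. The feature values below are invented for illustration and do not reflect the paper's data.

```python
import numpy as np
from sklearn.cluster import KMeans

attack_types = ["aircraft bombing", "armed assault", "hijacking", "airport bombing", "other"]
features = np.array([      # [attack frequency, average casualties] -- illustrative numbers only
    [40, 120],
    [35, 60],
    [20, 30],
    [18, 25],
    [5, 3],
], dtype=float)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)

# Order clusters by centroid norm so larger centroids map to higher risk.
order = np.argsort(-np.linalg.norm(kmeans.cluster_centers_, axis=1))
risk_names = {cluster: name for cluster, name in zip(order, ["high", "medium", "low"])}
for attack, label in zip(attack_types, kmeans.labels_):
    print(f"{attack}: {risk_names[label]} risk")
```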
[Objective] This paper conducts risk assessment and decision-making analysis for civil aviation, aiming to address the security challenges facing this industry. [Methods] Based on the risk assessment results for civil aviation, we built a decision tree for civil aviation counterterrorism, which examined the probabilities, deterrence effect, substitution effect, effectiveness of countermeasures, and consequences of potential terrorist attacks against civil aviation. [Results] We evaluated the effects of various countermeasures based on the analysis of potential terrorist attack threats. [Limitations] We only examined the proposed model with past terrorist incidents, which makes it difficult to assess future events. [Conclusions] This paper studies the attributes of possible terrorist attacks against the civil aviation system, including their probabilities, countermeasures, and consequences.
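An illustrative expected-consequence calculation of the kind that underlies such a decision tree: weight the consequence of each attack type by its probability, discounted by a countermeasure's effectiveness, and compare countermeasures by expected residual loss. All probabilities, consequences, and effectiveness values are hypothetical placeholders, not figures from the paper.

```python
attack_scenarios = {
    # attack type: (annual probability, consequence in relative loss units) -- assumed
    "hijacking": (0.02, 800),
    "aircraft bombing": (0.01, 1000),
    "airport armed assault": (0.03, 400),
}

countermeasures = {
    # countermeasure: fraction of consequence averted per attack type -- assumed
    "reinforced cockpit doors": {"hijacking": 0.7, "aircraft bombing": 0.1, "airport armed assault": 0.0},
    "enhanced screening": {"hijacking": 0.3, "aircraft bombing": 0.6, "airport armed assault": 0.1},
    "armed airport patrols": {"hijacking": 0.1, "aircraft bombing": 0.0, "airport armed assault": 0.6},
}

def expected_loss(effectiveness: dict) -> float:
    """Expected residual loss over all attack scenarios under one countermeasure."""
    return sum(p * c * (1.0 - effectiveness.get(attack, 0.0))
               for attack, (p, c) in attack_scenarios.items())

for name, eff in countermeasures.items():
    print(f"{name}: expected residual loss = {expected_loss(eff):.2f}")
```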
[Objective] This paper optimizes the deployment of anti-terrorist police resources based on queueing theory, aiming to improve the effectiveness of counterterrorism actions. [Methods] First, we proposed two optimal anti-terrorist policing strategies based on the M/M/1/∞ and M/M/N/∞ queueing models. Then, we compared the performance of the two models on four factors using simulated cases. [Results] We found that the M/M/N/∞ model had better performance. [Limitations] We did not examine the proposed model with real-world anti-terrorism and policing data. [Conclusions] The M/M/N/∞ queueing model could help us create better anti-terrorist policing strategies.
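A hedged sketch comparing expected waiting times under the M/M/1 and M/M/N models via the standard Erlang C formula. The arrival and service rates below are illustrative, not the paper's simulation values.

```python
from math import factorial

def mm1_wait(lam: float, mu: float) -> float:
    """Expected waiting time in queue for an M/M/1 system (requires lam < mu)."""
    return lam / (mu * (mu - lam))

def mmn_wait(lam: float, mu: float, n: int) -> float:
    """Expected waiting time in queue for an M/M/N system (requires lam < n*mu)."""
    a = lam / mu                              # offered load
    erlang_c_num = (a ** n / factorial(n)) * (n / (n - a))
    erlang_c_den = sum(a ** k / factorial(k) for k in range(n)) + erlang_c_num
    p_wait = erlang_c_num / erlang_c_den      # probability an arrival must wait (Erlang C)
    return p_wait / (n * mu - lam)

mu = 1.5                                      # incidents handled per hour per unit (assumed)
print(mm1_wait(lam=1.2, mu=mu))               # single unit, lighter load
print(mmn_wait(lam=4.0, mu=mu, n=4))          # four parallel units
```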
[Objective] This paper proposes a hierarchical classification screening method, aiming to improve the airport security system and passenger experience. [Methods] We proposed a deep feature learning method based on civil aviation and public security databases. Then, we trained a deep neural network with three layers (time series, space series, and environmental features) to obtain a joint representation of passenger risk factors. Finally, we generated early warning models for passenger security checks. [Results] The proposed early warning models could relieve the pressure of civil aviation security checks. [Limitations] More research is needed to examine the proposed model with data from small airports. [Conclusions] The early-warning model based on deep learning could effectively improve the work efficiency of airport security checks and the passenger experience.
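A minimal PyTorch sketch of this idea, interpreted here (as an assumption) as three parallel encoders for time-series, spatial, and environmental features whose outputs are concatenated into a joint representation feeding a risk-warning head. Input dimensions and layer sizes are illustrative only.

```python
import torch
import torch.nn as nn

class PassengerRiskNet(nn.Module):
    def __init__(self, time_dim=16, space_dim=8, env_dim=6, hidden=32):
        super().__init__()
        self.time_branch = nn.Sequential(nn.Linear(time_dim, hidden), nn.ReLU())
        self.space_branch = nn.Sequential(nn.Linear(space_dim, hidden), nn.ReLU())
        self.env_branch = nn.Sequential(nn.Linear(env_dim, hidden), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(3 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, t, s, e):
        # Concatenate the three branch outputs into a joint passenger representation.
        joint = torch.cat([self.time_branch(t), self.space_branch(s), self.env_branch(e)], dim=1)
        return torch.sigmoid(self.head(joint))   # risk score in [0, 1]

model = PassengerRiskNet()
risk = model(torch.randn(4, 16), torch.randn(4, 8), torch.randn(4, 6))
```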
[Objective] This paper proposes a Document Influence Model (DIM) based on automatic dynamic time window division, aiming to solve the time window division issue of dynamic topic models. [Methods] Firstly, we processed the text corpora with the traditional LDA model and a word vector model. Secondly, we constructed a comprehensive index reflecting the differences between time windows and the similarity within time windows. Finally, we built a new model based on this index and conducted an empirical study with a news corpus on the “Belt and Road” International Cooperation Summit Forum. [Results] The proposed model could quickly and effectively divide the time windows, which not only ensured the comparability of topics across different windows, but also evaluated the influence factors of the documents. [Limitations] We built the similarity index of time windows based on the traditional LDA model, which could be improved with the latest LDA models. [Conclusions] The new model is able to divide time-series text effectively, which improves the performance of the traditional dynamic topic model.
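A hedged sketch of the window-similarity ingredient: fit a plain LDA model, represent each candidate time window by the mean topic distribution of its documents, and compare adjacent windows with cosine similarity (low similarity between windows, together with high similarity within them, supports a window boundary). The toy corpus and two-topic setting are illustrative assumptions, not the paper's index.

```python
import numpy as np
from gensim import corpora, models

window_1 = [["belt", "road", "summit", "cooperation"], ["forum", "summit", "trade"]]
window_2 = [["infrastructure", "investment", "railway"], ["railway", "port", "investment"]]
texts = window_1 + window_2

dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, random_state=0)

def window_topic_vector(docs):
    """Average topic distribution over the documents of one time window."""
    vecs = []
    for doc in docs:
        dist = lda.get_document_topics(dictionary.doc2bow(doc), minimum_probability=0.0)
        vecs.append([p for _, p in dist])
    return np.mean(vecs, axis=0)

v1, v2 = window_topic_vector(window_1), window_topic_vector(window_2)
between_similarity = float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))
```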
[Objective] This paper examines several popular ensemble learning methods with real-world data, aiming to find the most suitable way to monitor the P2P credit risks facing China. [Methods] We extracted the borrowers’ features from five aspects and identified the most significant ones with the Random Forest method. Then, we compared prediction models based on four ensemble learning methods and five base classifiers. [Results] We found that the Rotation Forest method had the highest accuracy rate of 99.32% and the lowest error rate of 1.71%. Feature selection based on Random Forest could improve the performance of all related models significantly. [Limitations] The sample dataset needs to be expanded. [Conclusions] The proposed method could identify credit risks more effectively.
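A hedged sketch of this workflow with scikit-learn: select the most important features with a Random Forest, then compare several ensemble classifiers by cross-validated accuracy. Rotation Forest is not available in scikit-learn, so only stand-in ensembles are shown, and the synthetic data replaces the real borrower features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                              BaggingClassifier, GradientBoostingClassifier)
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=30, n_informative=8, random_state=0)

# Feature selection based on Random Forest importance scores.
selector = SelectFromModel(RandomForestClassifier(n_estimators=200, random_state=0)).fit(X, y)
X_selected = selector.transform(X)

ensembles = {
    "RandomForest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
}
for name, clf in ensembles.items():
    score = cross_val_score(clf, X_selected, y, cv=5).mean()
    print(f"{name}: mean accuracy = {score:.4f}")
```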
[Objective] This paper presents a new method to recognize metaphors in Chinese and English datasets. [Methods] First, we mapped the experimental dataset to a vector space, which was input to a convolutional neural network along with the property and keyword features. Then, we extracted the needed features with the convolutional and pooling layers, and classified them using an SVM. Finally, we combined Max-Pooling and Mean-Pooling to improve the accuracy of the extracted features. [Results] Compared with traditional models, our method increased the accuracy on the English verb-object, English adjective-noun and Chinese metaphor corpora by 4.12%, 0.84% and 4.50% respectively. [Limitations] Chinese word segmentation affects the training of the word vector model. We need to add more layers to the convolutional neural network. [Conclusions] The proposed method could effectively identify metaphors in Chinese and English corpora.
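A minimal sketch of this pipeline: a convolutional layer over word vectors, feature vectors built by concatenating Max-Pooling and Mean-Pooling outputs, and an SVM classifier on top. Dimensions and the random stand-in word vectors are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
from sklearn.svm import SVC

class MetaphorFeatureExtractor(nn.Module):
    def __init__(self, embed_dim=50, n_filters=32):
        super().__init__()
        self.conv = nn.Conv1d(embed_dim, n_filters, kernel_size=3, padding=1)

    def forward(self, x):                               # x: (batch, seq_len, embed_dim)
        h = torch.relu(self.conv(x.transpose(1, 2)))    # (batch, n_filters, seq_len)
        max_pooled = h.max(dim=2).values
        mean_pooled = h.mean(dim=2)
        return torch.cat([max_pooled, mean_pooled], dim=1)   # combined pooling features

extractor = MetaphorFeatureExtractor()
phrases = torch.randn(40, 5, 50)                        # 40 candidate phrases of 5 tokens each
labels = torch.randint(0, 2, (40,))                     # 1 = metaphorical, 0 = literal (random)
with torch.no_grad():
    features = extractor(phrases).numpy()

svm = SVC(kernel="rbf").fit(features, labels.numpy())   # SVM classification on pooled features
```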
[Objective] This paper compares community detection algorithms in the field of complex network analysis, aiming to support related information science studies. [Methods] First, we identified the similarities and differences of several community detection algorithms (i.e., their theoretical frameworks and calculation methods). Then, we examined these algorithms with small data sets. Third, we expanded the sample size and evaluated the performance of the Louvain algorithm, the Louvain algorithm with multilevel refinement, and the SLM algorithm on collaboration and citation networks. [Results] On the small dataset, the detection results of the GN and FN algorithms were similar, and the results of the SLM algorithm were better than those of the Louvain algorithm and the Louvain algorithm with multilevel refinement. In the field of library and information science, setting the resolution to 0.5 could help us analyze the detection results. The results of the SLM algorithm differed from those of the Louvain algorithm and the Louvain algorithm with multilevel refinement. The results of the latter two were almost the same, but differed at a resolution of 1.0. [Limitations] The dataset needs to be expanded. [Conclusions] The Louvain algorithm, the Louvain algorithm with multilevel refinement and the SLM algorithm are better than traditional algorithms. Among them, the SLM algorithm is the best option for analyzing the communities of citation networks.
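A hedged sketch of running Louvain community detection at the two resolutions discussed above, using networkx (>= 2.8). SLM and Louvain with multilevel refinement are not bundled with networkx, so only plain Louvain is shown, and the karate club graph stands in for the collaboration and citation networks.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

G = nx.karate_club_graph()                     # stand-in collaboration network

for resolution in (0.5, 1.0):
    communities = louvain_communities(G, resolution=resolution, seed=0)
    print(f"resolution={resolution}: {len(communities)} communities")
```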
[Objective] This paper proposes a new method to construct a working sentiment dictionary for sentiment analysis in the field of finance. [Methods] Our method built a sentiment dictionary based on the characteristics of the corpus and a knowledge base. It also mapped the textual information into a vector space using the word vector method. With the help of an existing general sentiment dictionary, we automatically labelled the training corpus and created training and forecasting sets at a ratio of 9:1. Finally, we used Python to build a deep neural network classifier and evaluated the sentiment polarity of the candidate words for the new dictionary. [Results] The accuracy of the proposed neural network classifier on the training set was 95.02%, while the accuracy on the forecasting set was 95.00%. Our results are better than those of existing models. [Limitations] The method of extracting seed words could be further optimized. [Conclusions] The proposed method increases the size of the corpus to train the neural network classifiers more effectively. It also extracts sentiment information from the semantic relevance of word vectors. The new sentiment dictionary provides possible directions for future research.
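A hedged sketch of the dictionary-expansion idea: train word vectors on the corpus, label training words with an existing general sentiment dictionary, fit a small neural network classifier, and predict the polarity of candidate words. The tiny corpus, seed words, and network size are illustrative placeholders.

```python
from gensim.models import Word2Vec
from sklearn.neural_network import MLPClassifier

corpus = [["stock", "price", "surge", "profit"],
          ["market", "crash", "loss", "panic"],
          ["profit", "growth", "rally"],
          ["loss", "decline", "risk"]]
w2v = Word2Vec(corpus, vector_size=50, min_count=1, seed=0)

# Words automatically labelled via a general sentiment dictionary (placeholder labels).
seed_words = {"surge": 1, "profit": 1, "growth": 1, "crash": 0, "loss": 0, "panic": 0}
X = [w2v.wv[w] for w in seed_words]
y = list(seed_words.values())

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(X, y)
candidate_words = ["rally", "decline"]
predicted_polarity = clf.predict([w2v.wv[w] for w in candidate_words])
```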
[Objective] This paper proposes a new method to calculate the similarity of science and technology documents by combining the information of texts and formulas, aiming to improve the performance of traditional methods. [Methods] Firstly, we mapped the feature elements of a single formula into a position vector, which helped us calculate the similarity of single formulas. Secondly, we computed the coverage and similarity of formulas between documents. Finally, the similarity of science and technology documents was calculated by combining the information of texts and formulas. [Results] We compared the classification results of the new method and the traditional ones, and found that the macro-average F-score of the new method increased by 6.7%. [Limitations] The test sets do not include the formula information of documents and need to be expanded. [Conclusions] The new method could calculate document similarity more accurately.
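A hedged sketch of combining textual and formula similarity into one document similarity score. The formula "position vectors" are simplified here to counts of a few structural elements, and the 0.7/0.3 weighting is an assumption for illustration, not the paper's scheme.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

doc_texts = ["energy of a photon depends on its frequency",
             "the photon energy is proportional to frequency"]
# Simplified formula features per document, e.g. counts of [operators, variables, functions].
doc_formula_vectors = np.array([[3, 4, 1],
                                [3, 5, 1]], dtype=float)

text_sim = cosine_similarity(TfidfVectorizer().fit_transform(doc_texts))[0, 1]
formula_sim = cosine_similarity(doc_formula_vectors)[0, 1]

alpha = 0.7                                    # weight of textual similarity (assumed)
combined_similarity = alpha * text_sim + (1 - alpha) * formula_sim
```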