Computing Maximal Likelihood Subset Repair for Inconsistent Data

In this paper, we study the problem of subset repair under integrity constraints. For an inconsistent data set, a subset repair removes a minimal set of tuples such that the integrity constraints are no longer violated in the remaining tuples. There usually exist multiple subset repairs, and it is difficult to determine which one is optimal. Most previous work prefers the one with the minimum number of deleted tuples to avoid excessive removal and information loss. However, this criterion deletes clean tuples and retains dirty tuples when the majority of tuples in a local scope are dirty. We observe that, under a proper model, the correctness probabilities of clean tuples are often larger than those of dirty tuples, and we therefore propose to determine the subset repair with maximum likelihood, which retains as many tuples with large correctness probability as possible. In this paper, we first formalize the maximum likelihood subset repair problem and analyze its hardness. Then we propose a correctness probability model, together with a scalable inference approach. Finally, an efficient approximate algorithm is proposed to compute the maximum likelihood subset repair. Extensive experiments on real-world datasets show that our proposal achieves higher precision and recall than state-of-the-art methods.

Anzhen Zhang, Shengji Hu, Chuanyu Zong, Jiajia Li, Xiufeng Xia
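
As a rough illustration of the maximum likelihood criterion described in the abstract above, the following Python sketch enumerates the consistent subsets of a toy relation and compares the minimum-deletion repair with the most likely one. The relation, the functional dependency zip -> city, the probability values, and the likelihood model (kept tuples counted as correct, deleted tuples as incorrect, independently) are all assumptions made for illustration; the paper's own probability model and approximate algorithm are not reproduced here.

```python
from itertools import combinations
from math import prod

# Toy relation: (id, zip, city, assumed correctness probability).
# All values are invented for illustration.
TUPLES = [
    (0, "10001", "New York", 0.9),
    (1, "10001", "Newark", 0.3),
    (2, "10001", "Newark", 0.2),
]


def conflicts(rows):
    """Pairs of tuple ids that violate the functional dependency zip -> city."""
    return {
        (a[0], b[0])
        for a, b in combinations(rows, 2)
        if a[1] == b[1] and a[2] != b[2]
    }


def is_consistent(kept, conflict_pairs):
    """A subset is consistent if it contains no conflicting pair."""
    return not any(a in kept and b in kept for a, b in conflict_pairs)


def likelihood(kept, rows):
    """Assumed model: kept tuples are correct, deleted tuples are incorrect."""
    return prod(p if i in kept else 1.0 - p for i, _, _, p in rows)


def consistent_subsets(rows):
    """Brute-force enumeration; exponential, so only usable on tiny inputs."""
    ids = [r[0] for r in rows]
    pairs = conflicts(rows)
    for k in range(len(ids) + 1):
        for kept in combinations(ids, k):
            if is_consistent(set(kept), pairs):
                yield frozenset(kept)


def max_likelihood_repair(rows):
    return max(consistent_subsets(rows), key=lambda s: likelihood(s, rows))


def min_deletion_repair(rows):
    return max(consistent_subsets(rows), key=len)


if __name__ == "__main__":
    print("minimum-deletion repair keeps:", sorted(min_deletion_repair(TUPLES)))
    print("max-likelihood repair keeps:  ", sorted(max_likelihood_repair(TUPLES)))
```

On this toy input the minimum-deletion repair keeps the two mutually consistent low-probability tuples, whereas the likelihood criterion keeps the single high-probability tuple, which mirrors the case of a locally dirty majority that the abstract describes.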

Design of Data Management System for Sustainable Development of Urban Agglomerations’ Ecological Environment Based on Data Lake Architecture

Research on the ecological environment of urban agglomerations plays a crucial role in enhancing environmental quality and ensuring sustainable development. In research on sustainable-development-oriented monitoring and assessment of ecological environments, the management and provision of data have gained significant prominence. However, vast data volumes, diverse data types, and inconsistent metadata descriptions hinder the comprehensive management of ecological environment data in urban agglomerations. This paper investigates the data requirements of sustainable development, with a particular focus on unified data management, online data product production and updates, and data service provision. To address these challenges, we design a metadata model that accommodates various types of datasets and facilitates their logical integration. Leveraging the data lake architecture, we achieve semantic-level data governance driven by relational associations and propose a data management solution for the unified management of heterogeneous, terabyte-scale datasets. In addition, we present the architectural design of the data management system and develop a prototype system that offers comprehensive data services, data product production, and other functionalities such as data visualization and analysis. This study provides extensive data services for monitoring and evaluation activities associated with Sustainable Development Goals (SDGs) 6, 11, and 15, while also supporting various application demonstrations and facilitating the sustainable development of urban ecosystems.

Jiabao Li, Wei Han, Xiaohui Huang, Yuewei Wang, Ao Long, Rongrong Duan, Xiaohua Tian, Yuqin Li

P-QALSH+: Exploiting Multiple Cores to Parallelize Query-Aware Locality-Sensitive Hashing on Big Data

Approximate nearest neighbor (ANN) search in high-dimensional Euclidean space is a fundamental problem in big data processing. Locality-Sensitive Hashing (LSH) is a popular scheme for solving the ANN search problem. In the index phase, an LSH scheme preprocesses multiple hash tables, and in the query phase it exploits the preprocessed hash tables to speed up the ANN search. Query-Aware LSH (QALSH), a state-of-the-art LSH scheme, has a rigorous theoretical guarantee on query accuracy but suffers from high time overhead in both the index and query phases. To improve query efficiency, a multi-core parallel QALSH scheme called P-QALSH was proposed, which is mainly optimized for the query phase. In this paper, we further extend P-QALSH to P-QALSH+, which parallelizes QALSH in both the index and query phases on multiple cores. Specifically, we first propose a Parallel Table Design to accelerate index construction. Then, we follow P-QALSH in exploiting a novel K-Counter Parallel Counting Technology and a novel Search Radius Estimation Strategy to improve query performance. Using six real-world datasets and eight synthetic datasets, we have performed extensive experiments on a 16-core machine. Experimental results demonstrate the superiority of P-QALSH+ in terms of parallel computing efficiency. Specifically, compared to QALSH, P-QALSH+ is 10-12X faster on index construction, achieves a 6-8X speedup on query search, and also shows a clear improvement in query accuracy.

Yikai Huang, Zezhao Hu, Jianlin Feng
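
To make the index and query phases mentioned in the abstract above more concrete, here is a minimal single-threaded Python sketch of query-aware collision counting: each hash table stores random projections of the data, and at query time an object counts as colliding in a table when its projection falls inside a bucket centered on the query's projection. The function names, the bucket `width`, and the collision `threshold` are hypothetical choices for illustration; the sketch omits the index structures, radius expansion, and parallel counting of QALSH and P-QALSH+, and should not be read as the authors' implementation.

```python
import random


def make_projections(dim, num_tables, seed=0):
    """Draw one Gaussian random projection vector per hash table."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_tables)]


def project(a, x):
    """Hash value of point x under projection a (a plain dot product)."""
    return sum(ai * xi for ai, xi in zip(a, x))


def build_tables(data, projections):
    """Index phase: one table of projected values per projection vector.

    Each table depends only on its own projection, so the tables could be
    built independently of one another (for example, one per core).
    """
    return [[project(a, x) for x in data] for a in projections]


def collision_counts(tables, projections, query, width):
    """Query phase: count, per object, the tables in which it collides.

    The bucket is anchored at the query: an object collides in a table if
    its projection lies within width / 2 of the query's projection.
    """
    counts = [0] * len(tables[0])
    for a, table in zip(projections, tables):
        hq = project(a, query)
        for i, h in enumerate(table):
            if abs(h - hq) <= width / 2:
                counts[i] += 1
    return counts


def candidates(counts, threshold):
    """Objects whose collision count reaches the threshold become candidates."""
    return [i for i, c in enumerate(counts) if c >= threshold]


if __name__ == "__main__":
    data = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0]]
    projections = make_projections(dim=2, num_tables=8)
    tables = build_tables(data, projections)
    counts = collision_counts(tables, projections, query=[0.0, 0.0], width=1.0)
    print("collision counts:", counts)
    print("candidates:", candidates(counts, threshold=6))
```

Candidates would then be verified by computing their true distances to the query, a step left out here for brevity.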

Face Super-Resolution via Progressive-Scale Boosting Network

Deep-learning-based face super-resolution (FSR) algorithms have outperformed traditional algorithms. However, existing methods need to pass multi-scale priors effectively constrained models. To alleviate this problem, we propose a progressive-scale boosting network framework, called PBN, which enables the progressive extraction of high-frequency information from low-resolution (LR) images to reconstruct high-resolution (HR) face images. To ensure the accuracy of the extracted high-frequency signals, we introduce a constraint from HR to LR, which constructs supervised learning by progressively downsampling the reconstructed image to an LR space. Specifically, we propose a triple-attention fusion block to focus on different local features and prevent the secondary loss of facial structural information by removing the pooling layers. Experiments demonstrate the superior performance of the proposed method, both quantitatively and qualitatively, on three widely used public face datasets (i.e., CelebA, FFHQ, and LFW) compared to existing state-of-the-art methods.

Yiyi Wang, Tao Lu, Jiaming Wang, Aibo Xu

An Investigation of the Effectiveness of Template Protection Methods on Protecting Privacy During Iris Spoof Detection

With the development of iris biometrics, more and more industries and fields have begun to apply iris recognition methods. However, as technology advances, attackers try to spoof iris recognition systems with printed iris images or other artifacts. As a result, iris spoof detection is becoming an increasingly important area of research. Spoof detection enhances the security and reliability of iris recognition systems, but an attacker can still subvert the systems by stealing iris data during the spoof detection phase. In this paper, we design a framework called TPISD to solve this issue. TPISD employs template protection methods to protect iris data during the spoof detection phase as well as the client-to-server phase. Specifically, iris data are converted into cancelable and irreversible templates after data capture. These templates are then used to train the spoof detection model. During the spoof detection phase, protected templates are used as input rather than the original iris images. Experiments conducted on the CASIA-Syn and CASIA-Interval datasets demonstrate that applying iris template protection techniques to the spoof detection model may reduce recognition accuracy but can enhance the security of the spoof detection model. This work verifies the feasibility of employing iris template protection methods to protect iris data during spoof detection.

Baogang Song, Jian Suo, Hucheng Liao, Huanhuan Li, Dongdong Zhao

Stock Volatility Prediction Based on Transformer Model Using Mixed-Frequency Data

With the increasing volume of high-frequency data in the information age, both challenges and opportunities arise in the prediction of stock volatility. On one hand, predictions from traditional methods that combine stock technical indicators and macroeconomic indicators still leave room for improvement; on the other hand, macroeconomic indicators and people's search records on search engines, which reflect the topics they are interested in, intuitively have an impact on stock volatility. To make the influence of these indicators easier to assess, macroeconomic indicators and stock technical indicators are grouped into objective factors, while Baidu search indices, which reflect people's topics of interest, are defined as subjective factors. To align data of different frequencies, we introduce the GARCH-MIDAS model. After mixing all of the above data, we feed them into a Transformer model as part of the training data. Our experiments show that this model outperforms the baselines in terms of mean square error. Using both types of data with the Transformer model significantly reduces the mean square error from 1.00 to 0.86.

Wenting Liu, Zhaozhong Gui, Guilin Jiang, Lihua Tang, Lichun Zhou, Wan Leng, Xulong Zhang, Yujiang Liu

A Hierarchy-Based Analysis Approach for Blended Learning: A Case Study with Chinese Students

Blended learning is generally defined as the combination of traditional face-to-face learning and online learning. This learning mode has been widely used in advanced education across the globe due to the social distancing restrictions of the COVID-19 pandemic as well as the development of technology. Online learning plays an important role in blended learning, and as it requires more student autonomy, the quality of blended learning in advanced education has been a persistent concern. Existing literature offers several elements and frameworks for evaluating the quality of blended learning. However, most of them either favour particular evaluation perspectives or simply offer general guidance for evaluation, which reduces the completeness, objectivity and practicality of the related work. In order to provide a more intuitive and comprehensive evaluation framework, this paper proposes a hierarchy-based analysis approach. Applying a gradient boosting model and a feature importance evaluation method, this approach mainly analyses student engagement and its three identified dimensions (behavioral engagement, emotional engagement, cognitive engagement) to address some persistent problems in blended learning evaluation. The results show that cognitive engagement and emotional engagement play a more important role in blended learning evaluation, implying that these two dimensions should be prioritised for improvement to achieve better learning and teaching quality.

Yu Ye, Gongjin Zhang, Hongbiao Si, Liang Xu, Shenghua Hu, Yong Li, Xulong Zhang, Kaiyu Hu, Fangzhou Ye

A Multi-teacher Knowledge Distillation Framework for Distantly Supervised Relation Extraction with Flexible Temperature

Distantly supervised relation extraction (DSRE) generates large-scale annotated data by aligning unstructured text with knowledge bases. However, automatic construction methods produce a substantial number of incorrect annotations, thereby introducing noise into the training process. Most sentence-level relation extraction methods rely on filters to remove noisy instances, but they ignore some useful information in negative instances. To effectively reduce noise interference, we propose a Multi-teacher Knowledge Distillation framework for Relation Extraction (MKDRE) to extract semantic relations from noisy data based on both global information and local information. MKDRE addresses two main problems: the deviation in knowledge propagation of a single teacher and the limitation of traditional distillation temperature on information utilization. Specifically, we utilize flexible temperature regulation (FTR) to adjust the temperature assigned to each training instance, so as to dynamically capture local relations between instances. Furthermore, we introduce the information entropy of hidden layers to obtain stable temperature calculations. Finally, we propose multi-view knowledge distillation (MVKD) to express global relations among teachers from various perspectives and obtain more reliable knowledge. The experimental results on the NYT19-1.0 and NYT19-2.0 datasets show that our proposed MKDRE significantly outperforms previous methods in sentence-level relation extraction.

Hongxiao Fei, Yangying Tan, Wenti Huang, Jun Long, Jincai Huang, Liu Yang
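
The role of the distillation temperature mentioned in the abstract above can be sketched in a few lines of Python: a higher temperature softens a teacher's output distribution, and soft targets from several teachers can be combined before the student is trained against them. The per-instance temperature rule below (normalized prediction entropy mapped linearly to the range `t_min` to `t_max`) and the plain averaging of teachers are hypothetical stand-ins, not the paper's FTR or MVKD formulations, which rely on hidden-layer entropy and multi-view relations among teachers.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; larger temperatures give softer outputs."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def instance_temperature(logits, t_min=1.0, t_max=4.0):
    """Hypothetical per-instance rule: more uncertain predictions get a
    higher temperature. Requires at least two classes."""
    normalized = entropy(softmax(logits)) / math.log(len(logits))
    return t_min + (t_max - t_min) * normalized


def multi_teacher_targets(teacher_logits, temperature):
    """Average the softened distributions of several teachers (an assumed,
    deliberately simple way to combine them)."""
    soft = [softmax(logits, temperature) for logits in teacher_logits]
    return [sum(column) / len(soft) for column in zip(*soft)]


def distillation_loss(student_logits, targets, temperature):
    """Cross-entropy between soft targets and the softened student output."""
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(targets, student))


if __name__ == "__main__":
    teachers = [[2.0, 1.0, 0.0], [1.5, 1.2, 0.1]]
    t = instance_temperature(teachers[0])
    targets = multi_teacher_targets(teachers, t)
    print("temperature:", round(t, 3))
    print("soft targets:", [round(p, 3) for p in targets])
    print("loss:", round(distillation_loss([1.0, 1.0, 1.0], targets, t), 3))
```

Assigning the temperature per instance, rather than fixing one value for the whole training set, is the idea the abstract attributes to flexible temperature regulation; the specific mapping used here is only one of many possible choices.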

PAEE: Parameter-Efficient and Data-Effective Image Captioning Model with Knowledge Prompter and Cross-Modal Representation Aligner

Large-scale pre-trained models and research on massive data have achieved state-of-the-art results in image captioning technology. However, the high cost of pre-training and fine-tuning has become a significant issue that needs to be considered. In this paper, we propose PAEE, a parameter-efficient and data-effective image captioning model that generates captions based on the input image encoding and the knowledge obtained from the newly introduced Knowledge Prompter. In PAEE, the only module that needs to be learned is the Cross-modal Representation Aligner (CRA) introduced between the visual encoder and language decoder, which facilitates the language model’s better adaptation to visual representation. The entire model greatly reduces the cost of pre-training and fine-tuning. Extensive experiments demonstrate that PAEE maintains competitive performance compared to large-scale pre-trained models and similar approaches, while reducing the number of trainable parameters. We design two new datasets to explore the data utilization ability of PAEE and discover that it can effectively use new data and achieve domain transfer without any training or fine-tuning. Additionally, we introduce the concept of small-data learning and find that PAEE has data-effective characteristics in limited computing resources and performs well even with fewer training samples.

Yunji Tian, Zhiming Liu, Quan Zou, Geng Chen

TSKE: Two-Stream Knowledge Embedding for Cyberspace Security

Knowledge representation models have been extensively studied and adopted in many areas, such as search and recommendation. However, due to the strong spatio-temporal relevance of cyberspace security and the dynamic variability of its domain knowledge, existing models and knowledge embedding methods cannot be adopted in this field directly. In this paper, we propose a two-stream knowledge embedding (TSKE) method for cyberspace security to jointly embed multi-dimensional characteristics. Specifically, we design a static stream neural network and a spatio-temporal stream neural network to extract the static knowledge and the spatio-temporal features of cyberspace security facts, converting this domain knowledge into a vector space. We conduct extensive experiments on the attack link prediction task in the field of cyberspace security, in which TSKE outperforms other static and dynamic embedding methods.

Angxiao Zhao, Haiyan Wang, Junjian Zhang, Yunhui Liu, Changchang Ma, Zhaoquan Gu

Research on the Impact of Executive Shareholding on New Investment in Enterprises Based on Multivariable Linear Regression Model

Based on principal-agent theory and optimal contract theory, companies increase executives’ shareholding to stimulate collaborative innovation. However, from the aspect of agency costs between management and shareholders (i.e. the first type) and between major shareholders and minority shareholders (i.e. the second type), the interests of management, shareholders and creditors become unbalanced as the marginal utility of executive equity incentives changes. In order to establish the correlation between the proportion of shares held by executives and investments in corporate innovation, we have chosen a range of publicly listed companies within China’s A-share market as the focus of our study. Employing a multi-variable linear regression model, we aim to analyze this relationship thoroughly. The following models were developed: (1) the impact model of executive shareholding on corporate innovation investment; (2) the impact model of executive shareholding on the two types of agency costs; (3) the model of the mediating influence of the two types of agency costs. Following both correlation and regression analyses, the findings confirm a meaningful and positive correlation between executives’ shareholding and the augmentation of corporate innovation investments. Additionally, the results indicate that executive shareholding contributes to the reduction of the first type of agency cost, thereby fostering corporate innovation investment. At the same time, however, it leads to an escalation in the second type of agency cost, thus impeding corporate innovation investment.

Shanyi Zhou, Ning Yan, Zhijun Li, Mo Geng, Xulong Zhang, Hongbiao Si, Lihua Tang, Wenyuan Sun, Longda Zhang, Yi Cao

MCNet: A Multi-scale and Cascade Network for Semantic Segmentation of Remote Sensing Images

High-resolution remote sensing images, which show more detailed ground information, play an important role in land classification. However, existing segmentation methods make insufficient use of multi-scale features and semantic information. In this study, a multi-scale and cascade semantic segmentation network (MCNet) was proposed and tested on the Potsdam and Vaihingen datasets. MCNet comprises four components. (1) Multi-scale feature extraction module: using dilated convolution and a parallel structure to fully extract multi-scale feature information. (2) Cross-layer feature selection module: adaptively selecting features at different levels to avoid the loss of key features. (3) Multi-scale object guidance module: weighting the features at different scales to express the multi-scale ground objects. (4) Cascade structure in the decoder part: increasing the information flow and enhancing the decoding capability of the network. Results show that the proposed MCNet outperformed the baseline networks, achieving an average overall accuracy of 86.91% and 87.82% on the two datasets, respectively. In conclusion, the multi-scale and cascade semantic segmentation network can improve the accuracy of land cover classification using remote sensing images.

Yin Zhou, Tianyi Li, Xianju Li, Ruyi Feng

WikiCPRL: A Weakly Supervised Approach for Wikipedia Concept Prerequisite Relation Learning

Concept prerequisite relations determine the order in which knowledge concepts are learned. This kind of concept relation has been used in a variety of educational applications, such as curriculum planning, learning resource sequencing, and reading list generation. Manually annotating prerequisite relations is time-consuming. Besides, data annotated by multiple people are often inconsistent. These factors have led to significant limitations in the use of supervised concept prerequisite learning methods. In this paper, we propose a weakly supervised Wikipedia Concept Prerequisite Relations Learning approach, called WikiCPRL, to identify prerequisite relations between Wikipedia concepts. First, we take the title of each Wikipedia article in a domain as a concept, employ the RefD algorithm to generate weak labels for all the concept pairs, and then build a concept map for the domain. Second, a graph attention layer is defined to fuse the context information of each concept in the concept map so as to update the concepts' feature representations. Finally, we use the VGAE model to reconstruct the concept map and obtain the concept prerequisite graph. Extensive experiments on both English and Chinese datasets demonstrate that the proposed approach can achieve the same performance as several existing supervised learning methods.

Kui Xiao, Kun Li, Yan Zhang, Xiang Chen, Yuanyuan Lou

An Effective Privacy-Preserving and Enhanced Dummy Location Scheme for Semi-trusted Third Parties

Location-Based Services (LBS) have garnered significant attention in recent years, emphasizing the need to improve location services while safeguarding user privacy. In this paper, we propose an effective privacy-preserving and enhanced dummy location scheme specifically designed for semi-trusted third-party scenarios, with a primary focus on defending against inference attacks targeting a user’s private location information. To achieve more effective location privacy preservation and mitigate privacy leaks stemming from a single point of failure, we employ a key information sharing mechanism, introduce a robust dummy location set generation approach, and present a comprehensive covering area construction strategy. To demonstrate the viability and effectiveness of our proposed scheme, we conduct a thorough simulation evaluation and performance analysis based on a practical road network setting.

Meijing Zuo, Luyao Peng, Jun Song

W-MRI: A Multi-output Residual Integration Model for Global Weather Forecasting

Weather forecasting refers to the process in which science and technology are applied to predict the conditions of the atmosphere for a given time. In this paper, we present W-MRI, a multi-output residual integration model for global weather forecasting. We introduce a residual mechanism into weather prediction, which can simulate changes in weather conditions, and carefully design a residual network to integrate and constrain multi-output residuals. W-MRI can effectively extract the features of meteorological data, capture their internal relations, and make fast and accurate forecasts of multiple meteorological variables, such as surface wind speed and precipitation. We use the fifth-generation ECMWF Re-Analysis (ERA5) data to train W-MRI, with samples selected every six hours. Importantly, our proposed W-MRI outperforms FourCastNet in multi-variable weather forecasting under the same experimental conditions. Moreover, experiments show that our model has a stable and significant advantage in short-to-medium-range forecasting, and that the longer the forecasting time-step, the more obvious the performance advantage of W-MRI, showing that the residual network has great advantages in weather forecasting.

Lihao Gan, Xin Man, Changyu Li, Lei She, Jie Shao

HV-Net: Coarse-to-Fine Feature Guidance for Object Detection in Rainy Weather

Object detection algorithms have been extensively researched in the field of computer vision, but they are still far from being perfect, especially in adverse weather conditions such as rainy weather. Traditional object detection models suffer from an inherent limitation when extracting features in adverse weather due to domain shift and weather noise, leading to feature contamination and a significant drop in model performance. In this paper, we propose a novel staged detection paradigm inspired by the human visual system, called Human Vision Network (HV-Net). HV-Net first extracts coarse-grained edge features and leverages their insensitivity to weather noise to reduce feature contamination and outline the edges of large and medium-sized objects. The subsequent network uses deep fine-grained features and edge-attentional features to generate clear images, enhancing the understanding of small objects that might be missed in edge detection and mitigating weather noise. The staged end-to-end pipeline design allows the clear features to be shared throughout the network. We validate the proposed method for rainy weather object detection on both real-world and synthetic datasets. Experimental results demonstrate significant improvements of our HV-Net compared to baselines and other object detection algorithms.

Kaiwen Zhang, Xuefeng Yan

Vehicle Collision Warning System for Blind Zone in Curved Roads Based on the Spatial-Temporal Correlation of Coordinate

Traffic safety has long been an important research topic in intelligent transportation, especially in mountainous areas, whose special terrain increases the traffic accident rate. The main contribution of this paper is a warning system for vehicles meeting in blind zones of mountain roads that offers low cost, stable and reliable communication, and high-accuracy data prediction. For this system, a corner-matched tracking algorithm based on special blocks and a bidirectional traffic estimation strategy based on coordinate correlation in spatiotemporal space are designed for the first time, providing reliable judgment information for the warning system. Moreover, communication methods that do not require a network environment are applied to the proposed system, solving the problem of weak network infrastructure in mountainous areas. Finally, the application performance shows that our system and its algorithm are sufficiently robust under complex weather conditions.

Qiao Meng, Xinli Li, Yu’an Zhang, Junyi Huangfu

Local-Global Cross-Fusion Transformer Network for Facial Expression Recognition

Facial Expression Recognition (FER) has received increasing attention in the computer vision community. For FER, there are two challenging issues among the facial images: large inter-class similarity and small intra-class discrepancy. To address these challenges and obtain a better performance, we propose a Local-Global Cross-Fusion Transformer network in this paper. Specifically, the method seeks to obtain a more discriminative facial representation by sufficiently considering the local features of multiple local regions of the face and global face features. In order to extract the critical local area features of the face, a local feature decomposition module based on facial landmarks is designed. In addition, a local-global cross-fusion Transformer is designed to enhance the synergistic correlation between local features and global features using the cross-attention mechanism, which can maximize the focus on key regions while considering the connection information among local regions. Extensive experiments conducted on three mainstream expression recognition datasets, RAF-DB, FERPlus, and AffectNet, show that the method outperforms many existing expression recognition methods and can significantly improve the accuracy of expression recognition.

Yicheng Liu, Zecheng Li, Yanbo Zhang, Jie Wen

Answering Spatial Commonsense Questions by Learning Domain-Invariant Generalization Knowledge

Existing spatial commonsense reading comprehension (SCRC) systems struggle to answer questions from unknown domains or any out-of-domain distributions, which prevents them from being deployed in real applications. Unsupervised domain adaptation (UDA) in QA has emerged as a major approach to address this challenge. However, existing methods mainly rely on generating synthetic data and pseudo-labeling target domains, which not only consumes extra computational resources but also places high demands on noise filtering of the generated data. To tackle these problems, we propose a UDA framework, called LEGRN-DIG, for spatial commonsense question answering using unlabeled target domain data. This framework avoids the use of labeled or pseudo-labeled target instances and noisy synthetic data. We apply domain-invariant generalization learning to integrate features of the target domain into the source domain and still use the source domain for supervised training. Extensive experiments are conducted to illustrate the effectiveness and robustness of our model.

Miaopei Lin, Jianxing Yu, Shiqi Wang, Hanjiang Lai, Wei Liu, Jian Yin

Global and Local Structure Discrimination for Effective and Robust Outlier Detection

Deep outlier detection on high-dimensional data is an important research problem with critical applications in many areas. Though promising performance has been demonstrated, we observe that existing methods characterize outliers from only a single perspective, which reduces the distinction between inliers and outliers as the number of training epochs grows. This in turn hurts the robustness and effectiveness of outlier detection, since the optimal training epoch on a specific dataset is unknown in unsupervised scenarios. In this paper, we propose a DNN-based framework with both global and local structure discrimination for effective and robust Outlier Detection, named GOOD. The global module compacts the data (mainly inliers), since the majority of data are inliers, while the local module scatters the data (mainly outliers), based on the observation that outliers reside in low-probability-density areas. These two modules are united by a self-adaptive weighting strategy that trades off the degree of complementary and competitive cooperation. The complementary views help detect outliers with diverse characteristics, and the competitive learning prevents a single module from learning the entire data too well and ensures robust detection performance. Comprehensive experimental studies on datasets from diverse domains show that GOOD significantly outperforms state-of-the-art methods, with up to 30% improvement in AUC, while performing much more robustly as the number of training epochs grows.

Canmei Huang, Li Cheng, Feng Yao, Renjie He

A Situation Knowledge Graph Construction Mechanism with Context-Aware Services for Smart Cockpit

With the continuous development of intelligence and network connectivity, the smart cockpit is gradually transforming into a multifunctional value space. Smart devices are heterogeneous, massive, complex, and contextually dynamic, which makes the services provided by the system inaccurate. Introducing knowledge graphs in smart cockpit situations can meet users’ needs in specific scenarios while delivering experiences that exceed expectations. This paper constructs a smart cockpit situation model with context, service, and user as the core elements, not only refining the context dimension but also incorporating context into the definition of service. Firstly, we analyze the elements that constitute the smart cockpit situation model and explore the connections between them. Secondly, a top-down approach is used to construct the smart cockpit situation ontology, using the smart cockpit situation model as a guide. Finally, the smart cockpit situation model is instantiated to build a knowledge graph for fitness scenarios. The research results show that the coverage relationships between scenarios can be inferred from the coverage relationships between contexts. Furthermore, we verify with a family travel scenario example that context can improve the accuracy of the service. The situation knowledge graph constructed in this paper can not only comprehensively describe the smart cockpit scene data but also enable services to adapt to the dynamic changes of contextual data.

Xinyi Sheng, Jinguang Gu, Xiaoyu Yang

A Task-Oriented Multi-turn Dialogue Mechanism for the Smart Cockpit

As an important carrier and platform for AI applications, the dialogue system for smart cockpits will become an important scenario for AI applications. However, there are some problems with the existing smart cockpit dialogue system, such as a lack of relevant corpus and knowledge annotations. These problems lead to the low accuracy of response representation generated by task-oriented dialogue systems for smart cockpits, which cannot meet the needs of smart human-computer interaction. To this end, this paper designs cockpitWOZ, a multi-round Chinese conversation dataset with knowledge annotation in the smart cockpit domain, which contains 2.9k conversations in four domains, including restaurants, attractions, music, and itineraries. On this basis, a new knowledge-driven task-based dialogue model is designed in this paper based on multiple baseline models. Firstly, a knowledge graph named cockpitKG based on the smart cockpit scenario is constructed to enhance the responsiveness of the model. Secondly, the smart cockpit system is built and a label replacement method is used to meet the requirements for real-time interaction during vehicle movement. Finally, the experimental results show that this model outperforms the baseline model and demonstrates that the introduction of background knowledge and label replacement can lead to higher-quality responses in the smart cockpit dialogue system.

Xiaoyu Yang, Xinyi Sheng, Jinguang Gu

MEOM: Memory-Efficient Online Meta-recommender for Cold-Start Recommendation

Online recommender systems aim to provide timely recommended results by constantly updating the model with new interactions. However, existing methods require sufficient personal data and fail to accurately support online recommendations for cold-start users. Although the state-of-the-art method adopts meta-learning to solve this problem, it requires recalling previously seen data for model updates, which is impractical because memory grows linearly over time. In this paper, we propose a memory-efficient online meta-recommender, MEOM, that can avoid the explicit use of historical data while achieving high accuracy. The recommender adopts MAML as a meta-learner and adapts it to online scenarios with effective regularization. Specifically, an online regularization method is designed to summarize the time-varying model and historical task gradients, such that the overall model optimization direction can be acquired to parameterize a meaningful regularizer as a penalty for the next round of model updates. The regularizer is then utilized to guide model updates with prior knowledge in a memory-efficient and accurate manner. In addition, to avoid task overfitting, an adaptive learning rate strategy is adopted to control model adaptation with more suitable learning rates at two levels. Experimental results on two real-world datasets show that our method can significantly reduce memory consumption while maintaining accuracy.

Yan Luo, Ruoqian Zhang

A Social Bot Detection Method Using Multi-features Fusion and Model Optimization Strategy

Online Social Networks (OSNs) have become an indispensable part of our lives, providing a platform for users to access and share information. However, the emergence of malicious social bots has disrupted the normal functioning of OSNs, posing a threat to their healthy development. With the evolution of social bots making them increasingly difficult to distinguish from human users, social bot detection has become a significant challenge. The difficulty lies in constructing effective features that can detect multiple types of social bots, in the limited performance of a single model in detecting social bots, and in handling the imbalanced distribution of social bots and human users in real environments. To address these challenges, this paper proposes a social bot detection method based on multi-feature fusion and a model optimization strategy. The proposed method analyzes differences in user profiles, tweet content, temporal information, and activity behaviors to extract and fuse effective features. A weighted soft voting mechanism and a strategy that searches for the best detection threshold are introduced to improve model performance. The superiority of our method is confirmed on four real datasets. The effectiveness of features from different dimensions in detecting social bots is also analyzed. Furthermore, our method achieves better performance on imbalanced datasets, indicating strong robustness and an ability to detect social bots in real environments.

Xiaohui Huang, Shudong Li, Weihong Han, Shumei Li, Yanchen Xu, Zikang Liu
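
The weighted soft voting and threshold search mentioned in the social bot detection abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the model weights, the candidate thresholds, and the use of F1 as the selection criterion are all assumptions made for illustration.

```python
def weighted_soft_vote(probas, weights):
    """Fuse per-model bot probabilities with a weighted average (soft voting)."""
    total = sum(weights)
    n = len(probas[0])
    return [sum(w * p[i] for w, p in zip(weights, probas)) / total for i in range(n)]


def f1_score(y_true, y_pred):
    """F1 for the positive (bot) class; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, scores, candidates):
    """Try each candidate threshold and keep the first one with the highest F1."""
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical validation data: 1 = bot, 0 = human.
y_true = [1, 1, 0, 0]
model_a = [0.9, 0.6, 0.2, 0.5]   # probabilities from a first (assumed) classifier
model_b = [0.7, 0.6, 0.1, 0.2]   # probabilities from a second (assumed) classifier
fused = weighted_soft_vote([model_a, model_b], weights=[2, 1])
threshold, f1 = best_threshold(y_true, fused, candidates=[0.3, 0.5, 0.7])
print(threshold, f1)
```

Searching the threshold on held-out data instead of fixing it at 0.5 is one common way to cope with imbalanced classes, which is the setting the abstract highlights.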

Benefit from AMR: Image Captioning with Explicit Relations and Endogenous Knowledge

Recent advanced image captioning methods mostly explore implicit relationships among objects by object-based visual feature modeling, while failing to capture the explicit relations and achieve semantic association. To tackle these problems, we present a novel method based on Abstract Meaning Representation (AMR) in this paper. Specifically, in addition to implicit relationship modeling of visual features, we design an AMR generator to extract explicit relations of images and further model these relations during generation. Besides, we construct an AMR-based endogenous knowledge graph, which helps extract prior knowledge for semantic association, strengthening the semantic expression ability of the captioning model without any external resources. Extensive experiments are conducted on the public MS COCO dataset, and results show that the AMR-based explicit semantic features and the associated semantic features can further boost image captioning to generate higher-quality captions.

Feng Chen, Xinyi Li, Jintao Tang, Shasha Li, Ting Wang

SSCAN: Structural Graph Clustering on Signed Networks

Structural graph clustering (SCAN) is a foundational problem in managing and profiling graph datasets, and it is widely encountered in many realistic scenarios. Because existing work on structural graph clustering focuses on unsigned graphs, existing SCAN methods are not applicable to signed networks, which can indicate friendly and antagonistic relationships. To tackle this problem, we investigate a novel structural graph clustering model, named SSCAN. On the basis of SSCAN, we propose an online approach that can efficiently compute the clusters for a given signed network. Furthermore, we also devise an efficient index structure, called SSCAN-Index+, which stores information about core vertices and structural similarities. The size of our proposed index is bounded by O(m), where m is the total number of edges in an input signed network. Building on the new index SSCAN-Index+, we develop an index-based query method designed to avoid invalid scans of the entire network. Extensive experiments on eight real signed networks demonstrate the effectiveness and efficiency of our proposed methods.

Zheng Zhao, Wei Li, Xiangxu Meng, Xiao Wang, Hongwu Lv
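
As background for the SSCAN entry above, the sketch below computes structural similarity and core vertices as defined in the classic SCAN model for unsigned graphs. It does not reproduce the signed SSCAN model or the SSCAN-Index+ structure, whose definitions are not given in the abstract; the function names, the toy graph, and the parameter values `eps` and `mu` are assumptions chosen for illustration.

```python
import math


def closed_neighborhoods(edges):
    """Map each vertex to its closed neighborhood (its neighbors plus itself)."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)
    return nbrs


def structural_similarity(nbrs, u, v):
    """Classic SCAN similarity: shared closed neighbors, normalized by the geometric mean of sizes."""
    return len(nbrs[u] & nbrs[v]) / math.sqrt(len(nbrs[u]) * len(nbrs[v]))


def core_vertices(nbrs, eps, mu):
    """A vertex is a core if at least mu vertices in its closed neighborhood have similarity >= eps."""
    cores = set()
    for v, gamma in nbrs.items():
        eps_nbrs = [w for w in gamma if structural_similarity(nbrs, v, w) >= eps]
        if len(eps_nbrs) >= mu:
            cores.add(v)
    return cores


# Toy unsigned graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
nbrs = closed_neighborhoods(edges)
print(structural_similarity(nbrs, 1, 2))
print(core_vertices(nbrs, eps=0.8, mu=3))
```

In this toy graph the triangle vertices qualify as cores while the pendant vertex does not, which is the kind of per-vertex information an index over core vertices and structural similarities would need to store.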

ANSWER: Automatic Index Selector for Knowledge Graphs

Efficient access to knowledge graphs is a basic premise for making full use of them. Since query processing efficiency is mainly affected by the index configuration, it is necessary to create effective indexes for knowledge graphs. However, none of the existing studies of index selection focuses on the characteristics of knowledge graphs. To fill this gap, we propose an automatic index selector for knowledge graphs based on reinforcement learning, named ANSWER, to select an appropriate index configuration according to the historical workloads. However, it is challenging to learn a well-trained index selection model due to the large action space of the reinforcement learning model and the requirement for lightweight embedding strategies. To address this problem, we first develop a novel predicate filter, which not only determines which vertical partitioning tables are worth indexing, but also reduces the action space of the model. Based on the filtered predicates, we derive an effective and lightweight encoder that not only embeds the main features of workloads into the model, but also guarantees the high efficiency of ANSWER. Experimental results on real-world knowledge graphs demonstrate the effectiveness of ANSWER in terms of knowledge graph query processing.

Zhixin Qi, Haoran Zhang, Hongzhi Wang, Zemin Chao

A Long-Tail Relation Extraction Model Based on Dependency Path and Relation Graph Embedding

Distant supervision, a method for relation extraction, leverages knowledge base triples to label entities and relations in text, but this leads to noisy labels and long-tail problems. Among long-tail dependency structures, the hierarchy tree of relations is the most classical and has demonstrated great efficacy in information extraction. However, the hierarchy tree of relations makes it difficult to obtain a sufficient information representation when a node has no sibling node or its parent node has no sibling node. To address this challenge, the use of constraint graphs has been proposed, but such approaches neglect the hierarchical information in the relations. To overcome this limitation, we propose a model based on dependency paths and relation graph embeddings. The model utilizes two relation graph structures, the constraint graph and the relation hierarchy tree, for relation learning, with the aim of transferring the knowledge learned from data-rich relations to long-tail relations. Additionally, the model leverages the shortest dependency path between entity pairs to increase the discriminative power of entity pairs in different bags for multi-instance learning. Experimental results show that the model achieves an AUC of 54.3% on the NYT-10 dataset and 86.3% on Hit@15 (<100).

Yifan Li, Yanxiang Zong, Wen Sun, Qingqiang Wu, Qingqi Hong

Multi-token Fusion Framework for Multimodal Sentiment Analysis

In this paper, we design a multi-token fusion (MTF) framework to process inter-modality and intra-modality information in parallel for multimodal sentiment analysis. Specifically, a tri-token transformer (TT) module is proposed to extract three tokens from each modality where one of them retains the unimodal feature and the other two tokens learn multi-modal features from the other two modalities respectively. Furthermore, a module based on the hierarchical element-wise self-attention (HESA) is used to process the three tokens of each modality extracted by TT. As a result, the important elements of tokens will be given more attention. Finally, we conduct extensive experiments on two public datasets, which prove the effectiveness and scalability of our network.

Zhihui Long, Huan Deng, Zhenguo Yang, Wenyin Liu

Generative Adversarial Networks Based on Contrastive Learning for Sequential Recommendation

Generative Adversarial Networks (GANs) have made key breakthroughs in computer vision and other fields, so some scholars have tried to apply them to sequential recommendation. However, the recommendation performance of GAN-based algorithms is unsatisfactory. The reason is that the discriminator cannot distinguish the original data from the generated data well if it relies only on the target function. Based on this, we propose Generative Adversarial Networks based on Contrastive Learning for sequential recommendation (shortened to CtrGAN). Firstly, the generator generates item sequences that the user may be interested in. Additionally, the true item sequences of the user are subjected to a mask operation, and the resulting masked sequences are treated as fake. Therefore, both the generated sequences and the fake sequences can be used in contrastive learning to train the generator. The true sequences and their masked versions are then combined with the generated sequences, and the discriminator is employed to distinguish them. Finally, the contrastive loss and discriminative loss are combined to guide the generator to generate item sequences that the user may be interested in. Experimental results show that CtrGAN achieves better recommendation accuracy than existing sequential recommendation algorithms.

Li Jianhong, Wang Yue, Yan Taotao, Sun Chengyuan, Li Dequan

Multimodal Stock Price Forecasting Using Attention Mechanism Based on Multi-Task Learning

This paper proposes a Multi-Task Attention-based Stock Prediction Model (MTASPM) to tackle the challenges of stock price forecasting in the Chinese market, which is characterized by strong volatility and numerous influencing factors. Employing multimodal information from stock correlation, historical trading data, company news, and government policies, MTASPM leverages multi-task learning and an attention mechanism to enhance predictive accuracy and capture data patterns for stock price forecasting. Experimental results on the Shanghai Exchange Stock Price Dataset (SHESPD) and the Shenzhen Exchange Stock Price Dataset (SZESPD) demonstrate that MTASPM outperforms eight baseline models. Specifically, MTASPM achieves improvements of 42.16% in MSE, 25.18% in RMSE, and 6.88% in MAE on SHESPD, and improvements of 16.95% in MSE, 8.64% in RMSE, and 6.12% in MAE on SZESPD. Overall, this study presents an effective approach for accurate stock price prediction that considers multiple influencing factors and utilizes multimodal information.

Haoyan Yang

Federated Trajectory Search via a Lightweight Similarity Computation Framework

Contact tracing is one of the most effective ways of disease control during a pandemic. A typical method for contact tracing is to examine the spatio-temporal companion between the trajectories of patients and others. However, human trajectory data collected by mobile devices cannot be directly shared due to privacy concerns. To utilize personal trajectory data in contact tracing, this paper presents a federated trajectory search engine called Fetra, which can efficiently process top-k search over a data federation composed of numerous mobile devices without uploading raw trajectories. To achieve this, we first propose a lightweight similarity measure, LCTS, based on spatio-temporal companion time to evaluate the similarity between trajectories. We then build a federated grid index named FGI via location anonymization. Given a query, a pruning strategy over FGI is applied to prune the candidate mobile devices dynamically. In addition, we propose a local optimization strategy to accelerate similarity computations on mobile devices. Extensive experiments on real-world data verify the effectiveness of LCTS and the efficiency of Fetra.

Chen Wu, Zhiyong Peng

Central Similarity Multi-view Hashing for Multimedia Retrieval

Hash representation learning of multi-view heterogeneous data is the key to improving the accuracy of multimedia retrieval. However, existing methods utilize local similarity and fall short of deeply fusing the multi-view features, resulting in poor retrieval accuracy. Current methods train their models using only local similarity and ignore global similarity. Furthermore, most recent works fuse the multi-view features via a weighted sum or concatenation. We contend that these fusion methods are insufficient for capturing the interaction between various views. We present a novel Central Similarity Multi-View Hashing (CSMVH) method to address these problems. Central similarity learning is used to solve the local similarity problem, as it can utilize the global similarity between the hash center and samples. We present copious empirical data demonstrating the superiority of gate-based fusion over conventional approaches. On the MS COCO and NUS-WIDE datasets, the proposed CSMVH performs better than the state-of-the-art methods by a large margin (up to 11.41% mean Average Precision (mAP) improvement).

Jian Zhu, Wen Cheng, Yu Cui, Chang Tang, Yuyang Dai, Yong Li, Lingfang Zeng

Entity Alignment Based on Multi-view Interaction Model in Vulnerability Knowledge Graphs

Entity alignment (EA) aims to match the same entities in different Knowledge Graphs (KGs), which is a critical task in KG fusion. EA has recently attracted the attention of many researchers, but the performance of general methods on KGs in some professional fields is not satisfactory. A vulnerability KG is a kind of KG that stores vulnerability knowledge. Its text and structure information differs from that of general KGs, so the EA task faces unique challenges. First, although some vulnerabilities have a unified CVE number, in reality the CVE number attribute value of many vulnerability entities in the KG is missing. Second, vulnerability KGs often contain a large number of 1−n and n−1 relations, and general entity embedding methods may generate similar vector representations for a large number of non-identical vulnerabilities. To address the above challenges, we propose a multi-view text-graph interaction model (TG-INT) for the EA task in vulnerability KGs. We use cross-lingual BERT to learn text embeddings and an optimized model called QuatAE to embed two graphs into a unified vector space. After that, we employ a multi-view interactive modeling scheme for the EA task. We verify the effectiveness of TG-INT on vulnerability KGs built from the vulnerability databases CNNVD and CNVD. The results show that our model is not only suitable for vulnerability KGs but also achieves promising results on general KGs.

Jin Jiang, Mohan Li
