
2021 | Book

Advances in Computer Graphics

38th Computer Graphics International Conference, CGI 2021, Virtual Event, September 6–10, 2021, Proceedings

Edited by: Prof. Nadia Magnenat-Thalmann, Victoria Interrante, Daniel Thalmann, Prof. Dr. George Papagiannakis, Assoc. Prof. Bin Sheng, Assoc. Prof. Jinman Kim, Prof. Marina Gavrilova

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 38th Computer Graphics International Conference, CGI 2021, held virtually in September 2021.

The 44 full papers presented together with 9 short papers were carefully reviewed and selected from 131 submissions. The papers are organized in the following topics: computer animation; computer vision; geometric computing; human poses and gestures; image processing; medical imaging; physics-based simulation; rendering and textures; robotics and vision; visual analytics; VR/AR; and engage.

Table of Contents

Frontmatter

Computer Animation

Frontmatter
Temporal Parameter-Free Deep Skinning of Animated Meshes

In computer graphics, animation compression is essential for efficient storage, streaming and reproduction of animated meshes. Previous work has presented efficient compression techniques that derive skinning transformations and weights by clustering vertices according to their geometric features over time. In this work we present a novel approach that assigns vertices to bone-influenced clusters and derives weights using deep learning, through a training set that consists of pairs of vertex trajectories (temporal vertex sequences) and the corresponding weights drawn from fully rigged animated characters. The approximation error of the resulting linear blend skinning scheme is significantly lower than that of competing previous methods, while at the same time a minimal number of bones is produced. Furthermore, the optimal set of transformations and weights is derived in fewer iterations, owing to the better initial positioning in the multidimensional variable space. Our method requires no parameters to be determined or tuned by the user during the entire process of compressing a mesh animation sequence.
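
For orientation, linear blend skinning deforms each vertex by a weighted sum of rigid bone transforms; a minimal numpy sketch of that core formula follows (illustrative only — the clustering and deep weight prediction described above are not shown, and all names are placeholders):

    import numpy as np

    def linear_blend_skinning(rest_verts, weights, bone_transforms):
        """rest_verts: (V, 3); weights: (V, B), rows sum to 1;
        bone_transforms: (B, 3, 4) per-frame rigid [R|t] matrices."""
        V = rest_verts.shape[0]
        homo = np.hstack([rest_verts, np.ones((V, 1))])             # (V, 4)
        per_bone = np.einsum('bij,vj->bvi', bone_transforms, homo)  # (B, V, 3)
        return np.einsum('vb,bvi->vi', weights, per_bone)           # (V, 3)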

Anastasia Moutafidou, Vasileios Toulatzis, Ioannis Fudos
The Impact of Animations in the Perception of a Simulated Crowd

Simulating virtual crowds is an important challenge in many areas such as games and virtual reality applications. A lot of effort has been dedicated to improving pathfinding, collision avoidance, and decision making to achieve more realistic human-like behavior. However, crowd simulation will remain far from realistic as long as virtual humans are limited to walking animations. Including animation variety could greatly enhance the plausibility of the populated environment. In this paper, we evaluate to what extent animation variety can affect the perceived level of realism of a crowd, regardless of the appearance of the virtual agents (bots vs. humanoids). The goal of this study is to provide recommendations for crowd animation and rendering when simulating crowds. Our results show that the perceived realism of the crowd trajectories and animations is significantly higher when using a variety of animations as opposed to locomotion animations alone, but only if we render realistic humanoids. If we can only render agents as bots, then there is not much gain from animation variety; in fact, it could potentially lower the perceived quality of the trajectories.

Elena Molina, Alejandro Ríos, Nuria Pelechano

Computer Vision

Frontmatter
Virtual Haptic System for Shape Recognition Based on Local Curvatures

Haptic object recognition is widely used in various robotic manipulation tasks. Using shape features obtained at either a local or global scale, robotic systems can identify objects solely by touch. Most existing work on haptic systems either utilizes a robotic arm with end-effectors to identify the shape of an object based on contact points, or uses a surface capable of recording pressure patterns. In this work, we introduce a novel haptic capture system based on the local curvature of an object. We present a haptic sensor system comprising three aligned and equally spaced fingers that move towards the surface of an object at the same speed. When an object is touched, our system records the relative times between each contact sensor. Simulating our approach in a virtual environment, we show that this new local and low-dimensional geometrical feature can be effectively used for shape recognition. Even with 10 samples, our system achieves an accuracy of over 90% without using any sampling strategy or any associated spatial information.
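
As a toy illustration of the feature (our reading of the abstract, with assumed geometry: probe spacing d, approach speed v), the relative contact times yield surface height offsets whose second difference approximates the local curvature:

    import numpy as np

    def curvature_proxy(t_left, t_mid, t_right, spacing, speed):
        """Three collinear fingers, `spacing` apart, approach at `speed`.
        Later contact means a deeper surface point; the discrete second
        derivative of the heights indicates convexity and its magnitude."""
        h = speed * np.array([t_left, t_mid, t_right])   # surface heights
        return (h[0] - 2.0 * h[1] + h[2]) / spacing ** 2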

Guillem Garrofé, Carlota Parés, Anna Gutiérrez, Conrado Ruiz Jr, Gerard Serra, David Miralles
Stable Depth Estimation Within Consecutive Video Frames

Deep learning based depth estimation methods have proven effective and promising, especially those learning depth from monocular video. Depth-from-video is unsupervised depth estimation in the true sense, as it needs neither depth ground truth nor stereo image pairs as supervision. However, most existing depth-from-video methods do not consider the frame-to-frame stability of the depth estimates. We found that depths within temporally consecutive frames are unstable, even though single-image depth can be estimated well by recent works. This work aims to solve that problem. Specifically, we define a temporal smoothness term for the depth map and propose a temporal stability loss that constrains depths of the same objects within consecutive frames to remain stable. We also propose an inconsistency check based on the differences between synthesized view frames and their original RGB frames. Building on the inconsistency check, we propose a self-discovered mask to handle moving and occluded objects. Experiments show that the proposed method is effective and estimates stable depth across temporally consecutive frames. Meanwhile, it achieves competitive performance on the KITTI dataset.
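
A hedged PyTorch sketch of a temporal stability term of this flavor (the paper's exact loss and the ego-motion warping that aligns the frames are not shown; `valid_mask` stands in for the self-discovered mask):

    import torch

    def temporal_stability_loss(depth_t, depth_next_warped, valid_mask):
        """depth_t, depth_next_warped: (B, 1, H, W), the next frame's depth
        already warped into frame t; valid_mask in {0, 1} excludes moving
        or occluded pixels flagged by the inconsistency check."""
        diff = (depth_t - depth_next_warped).abs() * valid_mask
        return diff.sum() / valid_mask.sum().clamp(min=1.0)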

Fei Luo, Lin Wei, Chunxia Xiao
Progressive Multi-scale Reconstruction for Guided Depth Map Super-Resolution via Deep Residual Gate Fusion Network

Depth maps obtained by consumer depth sensors are often accompanied by two challenging problems: low spatial resolution and insufficient quality, which greatly limit the potential applications of depth images. To overcome these shortcomings, some depth map super-resolution (DSR) methods extrapolate a high-resolution depth map from a low-resolution one with the additional guidance of the corresponding high-resolution intensity image. However, these methods are still prone to texture copying and boundary discontinuities due to improper guidance. In this paper, we propose a deep residual gate fusion network (DRGFN) for guided depth map super-resolution with progressive multi-scale reconstruction. To alleviate the misalignment between color images and depth maps, DRGFN applies a color-guided gate fusion module that acquires content-adaptive attention for better fusion of color and depth features. To focus on restoring details such as boundaries, DRGFN applies a residual attention module that highlights the different importance of different channels. Furthermore, DRGFN applies a multi-scale fusion reconstruction module to exploit multi-scale information for better image reconstruction. Quantitative and qualitative experiments on several benchmarks show that DRGFN achieves state-of-the-art performance for depth map super-resolution.

Yang Wen, Jihong Wang, Zhen Li, Bin Sheng, Ping Li, Xiaoyu Chi, Lijuan Mao
SE_EDNet: A Robust Manipulated Faces Detection Algorithm

Face manipulation techniques have raised concern over potential threats, demanding effective image forensic methods. Various approaches have been proposed, but when detecting higher-quality manipulated faces, the performance of previous methods is not good enough. To prevent the abuse of these techniques and improve detection ability, this paper proposes a new algorithm named Squeeze-Excitation Euclidean Distance Network (SE_EDNet) to detect manipulated faces, suitable for both Deepfake and GAN detection. SE_EDNet uses Euclidean distance to describe the similarity of vectors, which gives higher weights to important areas than the traditional self-attention mechanism. Further, we take frequency into account and extract residual information obtained by a second-order filter; the residuals are then combined with the original images as the input features for the network. Comparison experiments show that SE_EDNet performs better than existing algorithms. Extensive robustness experiments on Celeb-DF and DFFD demonstrate that the proposed algorithm is robust to attacks in terms of AUC scores and recall.
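
A minimal sketch of attention scored by negative squared Euclidean distance rather than dot products (our interpretation of the abstract, not the released implementation):

    import torch

    def euclidean_attention(q, k, v):
        """q, k, v: (B, N, D). Negative squared Euclidean distance is used
        as the similarity score, so closer vectors receive higher weight."""
        d2 = torch.cdist(q, k) ** 2                        # (B, N, N)
        attn = torch.softmax(-d2 / q.shape[-1] ** 0.5, dim=-1)
        return attn @ v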

Chaoyang Peng, Lihong Yao, Tanfeng Sun, Xinghao Jiang, Zhongjie Mi
PointCNN-Based Individual Tree Detection Using LiDAR Point Clouds

Due to the rapid development of deep learning technology in recent years, many scholars have applied deep learning to the field of remote sensing imagery, but few have directly applied LiDAR point clouds to 3D neural networks for tree detection. Existing methods usually achieve good detection results in a specific single scene, but in complex scenes, such as those containing diverse tree types, urban forests and high forest density, the detection results are not satisfactory. Therefore, this paper presents a PointCNN-based method for 3D tree detection using LiDAR point clouds, which aims to improve both detection accuracy in complex scenes and versatility. The method first builds a canopy height model (CHM) from the raw LiDAR point clouds and obtains rough seed points on the CHM. It then extracts detection samples, each consisting of a single tree's point cloud data, based on the rough seed points. Next, a 3D-CNN classifier based on PointCNN classifies the detection samples, and the classification results are used to filter the seed points. Finally, our method performs a tree stagger analysis on seed points that lie close together. This study selected twelve experimental plots from study areas in Bend, Central Oregon, USA. In our experiments, the highest matching score and average score reached 91.0 and 88.3. The results show that our method can effectively extract tree information in complex scenes.

Wenyuan Ying, Tianyang Dong, Zhanfeng Ding, Xinpeng Zhang
Variance Weight Distribution Network Based Noise Sample Learning for Robust Person Re-identification

Person re-identification (re-ID) usually requires a large amount of well-labeled training data to learn generalized, discriminative person feature representations. Most current deep learning models assume that all training data are correctly labeled. However, noisy data commonly exist in large-scale practical applications, due to incorrect labeling as well as person detector errors or occlusions. Both types of noise can influence model training, yet they have been ignored by most re-ID models so far. In this paper, we propose a robust deep re-ID model, called the variance weight distribution network (VWD-Net), to address this problem. Different from the traditional representation of each person image as a feature vector, the variance weight distribution network focuses on the following three aspects. 1) An improved Gaussian distribution and its variance are used to represent the uncertainty of person features. 2) A well-designed loss in the variance weight distribution network delegates the distribution uncertainty with respect to the training data. 3) The noisy labels are rectified to further improve model training. A large variance/uncertainty is assigned to noisy samples, whose labels are then rectified, in order to mitigate their negative impact on the training process. Extensive experiments on two benchmarks demonstrate the robustness and effectiveness of VWD-Net.

Xiaoyi Long, Ruimin Hu, Xin Xu
Monocular Dense SLAM with Consistent Deep Depth Prediction

Monocular simultaneous localization and mapping (SLAM), which uses a single moving camera for motion tracking and 3D scene structure reconstruction, is an essential task for many applications, such as vision-based robotic navigation and augmented reality (AR). However, most existing methods can only recover sparse or semi-dense point clouds, which are not adequate for many high-level tasks like obstacle avoidance. Meanwhile, the state-of-the-art methods use multi-view stereo to recover depth, which is sensitive to low-textured and non-Lambertian surfaces. In this work, we propose a novel dense mapping method for monocular SLAM that integrates deep depth prediction. More specifically, a classic feature-based SLAM framework is first used to track camera poses in real time. Then an unsupervised deep neural network for monocular depth prediction estimates dense depth maps for selected keyframes. By incorporating a joint optimization method, the predicted depth maps are refined and used to generate local dense submaps. Finally, contiguous submaps are fused under the ego-motion constraint to construct a globally consistent dense map. Extensive experiments on the KITTI dataset demonstrate that the proposed method remarkably improves the completeness of dense reconstruction in near real time.

Feihu Yan, Jiawei Wen, Zhaoxin Li, Zhong Zhou
3D Shape-Adapted Garment Generation with Sketches

Garment generation or reconstruction is in high demand for many digital applications, and the traditional process is time-consuming. In recent years, garment reconstruction from sketches leveraging deep learning and principal component analysis (PCA) has made great progress. In this paper, we present a data-driven approach in which 3D garments are directly generated from sketches combined with given body shape parameters. Our framework is an encoder-decoder architecture. In our network, sketch features extracted by DenseNet and body shape parameters are each encoded to a latent code. The new latent code, obtained by adding the latent codes of the sketch and the body shape, is then decoded by a fully convolutional mesh decoder. Our network thus enables body-shape-adapted, detailed 3D garment generation from a garment sketch and body shape parameters. With the fully convolutional mesh decoder, the network can show the effect of both body shape and sketch on the generated garment. Experimental results show that the fully convolutional mesh decoder we use to reconstruct the garment achieves higher accuracy and preserves more detail than the PCA-based method.

Yijing Chen, Chuhua Xian, Shuo Jin, Guiqing Li

Geometric Computing

Frontmatter
Light-Weight Multi-view Topology Consistent Facial Geometry and Reflectance Capture

We present a light-weight multi-view capture system with different lighting conditions to generate topology-consistent facial geometry and high-resolution reflectance texture maps. Firstly, we construct the base mesh from multi-view images using stereo reconstruction algorithms. Then we leverage mesh deformation to register a template mesh to the reconstructed geometry for topology consistency; facial and ear landmarks are also utilized to guide the deformation. We adopt photometric stereo and BRDF fitting methods to recover the facial reflectance field. The specular normal, which contains high-frequency information, is finally utilized to refine the coarse geometry to sub-millimeter detail. The captured topology-consistent geometry and high-quality reflectance information can be used to produce a lifelike personalized digital avatar.

Penglei Ji, Hanchao Li, Luyan Jiang, Xinguo Liu
Real-Time Fluid Simulation with Atmospheric Pressure Using Weak Air Particles

Atmospheric pressure is important yet often ignored in fluid simulation, resulting in many phenomena being overlooked. This paper presents a particle-based approach to simulate versatile liquid effects under atmospheric pressure in real time. We introduce weak air particles as a sparse sampling of air. The weak air particles can be used to efficiently track liquid surfaces under atmospheric pressure, and are weakly coupled with the liquid. We allow the large-mass liquid particles to contribute to the density estimation of small-mass air particles and neglect the air’s influence on liquid density, leaving only the surface forces of air on the liquid to guarantee the stability of the two-phase flow with a large density ratio. The proposed surface force model is composed of density-related atmospheric pressure force and surface tension force. By correlating the pressure and the density, we ensure that the atmospheric pressure increases as the air is compressed in a confined space. Experimental results demonstrate the efficiency and effectiveness of our methods in simulating the interplay between air and liquid in real time.
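
For intuition, a brute-force SPH sketch of the asymmetric coupling described above: liquid particles contribute to the air particles' density estimates, but not vice versa (standard poly6 kernel; a sketch, not the paper's solver):

    import numpy as np

    def poly6(r, h):
        """Standard 3D poly6 SPH kernel, zero outside support radius h."""
        w = np.where(r < h, (h ** 2 - r ** 2) ** 3, 0.0)
        return 315.0 / (64.0 * np.pi * h ** 9) * w

    def air_density(air_pos, liquid_pos, m_air, m_liquid, h):
        """Air density sums over air AND liquid neighbors; liquid density
        (not shown) would ignore air particles, per the abstract."""
        rho = np.zeros(len(air_pos))
        for pts, m in ((air_pos, m_air), (liquid_pos, m_liquid)):
            r = np.linalg.norm(air_pos[:, None, :] - pts[None, :, :], axis=-1)
            rho += m * poly6(r, h).sum(axis=1)
        return rho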

Tian Sang, Wentao Chen, Yitian Ma, Hui Wang, Xubo Yang

Human Poses and Gestures

Frontmatter
Reinforcement Learning for Quadruped Locomotion

In adversarial games like VR hunting, which involve predators and prey, the locomotive behaviour of the non-player character (NPC) is crucial. For effective and realistic quadruped locomotion, the major technical contributions of this paper are inverse kinematics embedded motion control, quadruped locomotion behaviour adaptation, and dynamic-environment-informed reinforcement learning (RL) of the NPC agent. The behaviour of each NPC can be improved from top-level decision making, such as pursuit and escape, down to the actual skeletal motion of bones and joints. The new concepts and techniques are illustrated by a specific use case of predator and prey interaction, in which the objective is to create an intelligent locomotive predator that reaches its autonomously steering locomotive prey as fast as possible in all circumstances. Experiments and comparisons are conducted against vanilla dynamic-target training; the RL agent of the quadruped displays more realistic limb movements and produces faster locomotion towards the autonomously steering target.

Kangqiao Zhao, Feng Lin, Hock Soon Seah
Partially Occluded Skeleton Action Recognition Based on Multi-stream Fusion Graph Convolutional Networks

Skeleton-based action recognition methods have been widely developed in recent years. However, occlusion remains a difficult problem. Existing skeleton action recognition methods are usually based on complete skeleton data, and their performance drops greatly in occluded skeleton action recognition tasks. To improve recognition accuracy on occluded skeleton data, a multi-stream fusion graph convolutional network (MSFGCN) is proposed. The proposed network consists of multiple streams, and different streams handle different occlusion cases. In addition, joint coordinates, relative coordinates, small-scale temporal differences and large-scale temporal differences are extracted simultaneously to construct more discriminative multimodal features. In particular, to the best of our knowledge, we are the first to propose the simultaneous extraction of temporal difference features at different scales, which can more effectively distinguish between actions with different motion amplitudes. Experimental results show that the proposed MSFGCN obtains state-of-the-art performance on occluded skeleton datasets.
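
A small sketch of the multi-scale temporal-difference features (illustrative; the graph convolutions and stream fusion are omitted, and the step sizes are assumptions):

    import numpy as np

    def temporal_differences(joints, small_step=1, large_step=5):
        """joints: (T, J, 3) joint coordinates over T frames. Returns
        zero-padded small- and large-scale differences of shape (T, J, 3),
        separating subtle motions from large-amplitude ones."""
        def diff(step):
            d = joints[step:] - joints[:-step]
            pad = np.zeros((step,) + joints.shape[1:])
            return np.concatenate([pad, d], axis=0)
        return diff(small_step), diff(large_step)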

Dan Li, Wuzhen Shi
Social-Scene-Aware Generative Adversarial Networks for Pedestrian Trajectory Prediction

Pedestrian trajectory prediction is crucial across a wide range of applications, such as self-driving vehicles and social robots. Such prediction is challenging because crowd behavior is inherently determined by various factors, such as obstacles, stationary crowd groups and destinations, which are difficult to represent effectively. In particular, pedestrians tend to be affected much more by the pedestrians in front of them than by those behind them, which is often ignored in the literature. In this paper, we propose a novel framework, Social-Scene-Aware Generative Adversarial Networks (SSA-GAN), comprising three modules, to predict the future trajectories of pedestrians in dynamic scenes. Specifically, in the Scene module, we model the original scene image as a scene energy map by combining various scene factors and calculating the probability of pedestrians passing at each location; the modeling formula is inspired by the distance relationship between pedestrians and scene factors. Moreover, the Social module aggregates neighbors' interactions on the basis of the correlation between the motion histories of pedestrians. This correlation is captured by a self-attention pooling module and limited by the field of view. Finally, the Generative Adversarial module with a variety loss addresses the multimodality of pedestrian trajectories. Extensive experiments on publicly available datasets validate the effectiveness of our method for crowd behavior understanding and trajectory prediction.

Binhao Huang, Zhenwei Ma, Lianggangxu Chen, Gaoqi He

Image Processing

Frontmatter
Cecid Fly Defect Detection in Mangoes Using Object Detection Frameworks

Mango exports have experienced rapid growth in global trade over the past few years; however, mangoes are susceptible to surface defects that can affect their market value. This paper investigates the automated detection of a mango defect caused by cecid flies, which can affect a significant portion of the production yield. Object detection frameworks using CNNs were used to localize and detect multiple defects present in a single mango image. This paper also proposes modified versions of R-CNN and FR-CNN that replace their region search algorithms with segmentation-based region extraction. A dataset consisting of 1329 cecid fly surface blemishes was used to train the object detection models. The experiments show comparable performance between the modified and existing state-of-the-art object detection frameworks: Faster R-CNN achieved the highest average precision of 0.901 at aP50, while the Modified FR-CNN achieved the highest average precision of 0.723 at aP75.
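
For reference, aP50 and aP75 denote average precision with detections counted as correct at intersection-over-union (IoU) thresholds of 0.5 and 0.75; a minimal IoU helper (the standard definition, not the paper's evaluation code):

    def iou(a, b):
        """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)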

Maria Jeseca C. Baculo, Conrado Ruiz Jr, Oya Aran
Twin-Channel GAN: Repair Shape with Twin-Channel Generative Adversarial Network and Structural Constraints

The establishment of 3D content with deep learning has been a focus of research in computer graphics in recent years. Researchers have analyzed 3D shapes through a divide-and-conquer strategy using both geometry information and structure information. Although many works perform well, several problems remain: for example, geometry information may be missing, and generated shapes may not be structurally plausible. In this work, we propose the Twin-Channel GAN for 3D shape completion. In this framework, structure information is exploited via structural constraints that optimize the details of 3D shapes. The experimental results demonstrate that our method achieves better performance.

Zhenjiang Du, Ning Xie, Zhitao Liu, Xiaohua Zhang, Yang Yang
CoPaint: Guiding Sketch Painting with Consistent Color and Coherent Generative Adversarial Networks

Art design plays an important role in attracting users. Through art design, some sketches are brought more in line with aesthetics. Traditionally, many series of black-and-white sketches must be colored manually using the same colors, which is time-consuming and difficult for art designers. In addition, coherent sketch painting is challenging to automate. We propose a GAN-based approach, CoPaint, for sketch colorization. Our neural network takes as input two black-and-white sketches with different rotation angles and produces a series of high-quality colored images with consistent colors. We present an approach to generate a coherent sketch painting dataset. We also propose a paired generator network with shared weights that consists of convolutional layers and batch-normalization layers. In addition, we propose a similarity loss that makes the images produced by the generator more similar. The provided experiments demonstrate the effectiveness of our approach.

Shiqi Jiang, Chenhui Li, Changbo Wang
Multi-Stream Fusion Network for Multi-Distortion Image Super-Resolution

Deblurring, denoising and super-resolution (SR) are important image recovery tasks committed to improving image quality. Despite the rapid development of deep learning and the many studies on improving image quality, most existing recovery solutions deal only with quality degradation caused by a single distortion factor, such as SR focusing on improving spatial resolution. Since very little work has analyzed the interaction and characteristics of mixed deblurring, denoising and SR problems, this paper considers the multi-distortion image recovery problem from a holistic perspective and introduces an end-to-end multi-stream fusion network (MSFN) to restore a multi-distortion image (a low-resolution image with noise and blur) into a clear high-resolution (HR) image. Firstly, MSFN adopts multiple reconstruction branches to extract deblurring, denoising and SR features with respect to the different degradations. Then, MSFN gradually fuses these multi-stream recovery features in a determined order and obtains an enhanced restoration feature using two fusion modules. In addition, MSFN uses fusion modules and residual attention modules to facilitate the fusion of recovery features from the denoising and deblurring branches into the trunk SR branch. Experiments on several benchmarks fully demonstrate the superiority of MSFN in solving the multi-distortion image recovery problem.

Yang Wen, Yupeng Xu, Bin Sheng, Ping Li, Lei Bi, Jinman Kim, Xiangui He, Xun Xu
Generative Face Parsing Map Guided 3D Face Reconstruction Under Occluded Scenes

Over the past few years, single-view 3D face reconstruction methods have produced beautiful 3D models. Nevertheless, these works take unobstructed faces as input. We describe a system designed to reconstruct convincing face texture in the presence of occlusion. Motivated by parsing facial features, we propose a complete face parsing map generation method guided by landmarks. We estimate the 2D face structure at the plausible positions of the occluded areas, which is then used for the construction of the 3D texture. An effective anti-occlusion face reconstruction method should ensure the authenticity of the output, including the topological structure between the eyes, nose, and mouth. We extensively test our method and its components, qualitatively demonstrating the rationality of our estimated facial structure. We conduct extensive experiments on general 3D face reconstruction tasks as concrete examples to demonstrate the method's superior ability to handle cases where existing methods often break down. We further provide numerous quantitative examples showing that our method advances both the quality and the robustness of 3D face reconstruction under occluded scenes.

Dapeng Zhao, Yue Qi
Compact Double Attention Module Embedded CNN for Palmprint Recognition

Palmprint-based biometric recognition has received tremendous attention due to several advantages, such as high security, non-invasiveness and good hygiene. Deep convolutional neural networks (CNNs) have recently been applied successfully to palmprint recognition and achieved promising performance thanks to breakthroughs in image classification; however, they usually require a massive amount of labeled samples to finetune the network. In this paper, we propose a compact CNN with a limited number of layers for palmprint recognition, embedding double attention mechanisms into the convolutional layers. Specifically, we first design a channel attention module to learn and select the discriminative channel maps by adaptively optimizing the attention weights among all channels. Then, we engineer a location attention module to learn the position-specific features of the palmprints. Both the channel and location attention modules are embedded into each convolutional layer, such that more discriminative features can be efficiently exploited during feature learning. Lastly, we train a fully convolutional network as the classifier for feature identification. Extensive experimental results on three widely used databases demonstrate the effectiveness of the proposed method in comparison with the state-of-the-art.
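
A minimal PyTorch sketch of a squeeze-excitation-style channel attention module of the kind described (dimensions and reduction ratio are placeholders, not the paper's configuration):

    import torch.nn as nn

    class ChannelAttention(nn.Module):
        """Squeeze spatial dims, then learn per-channel weights."""
        def __init__(self, channels, reduction=8):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.ReLU(),
                nn.Linear(channels // reduction, channels), nn.Sigmoid())

        def forward(self, x):                  # x: (B, C, H, W)
            w = self.fc(x.mean(dim=(2, 3)))    # (B, C) channel weights
            return x * w[:, :, None, None]     # reweight feature maps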

Yongmin Zheng, Lunke Fei, Wei Jia, Jie Wen, Shaohua Teng, Imad Rida
M2M: Learning to Enhance Low-Light Image from Model to Mobile FPGA

With the development of convolutional neural networks, the effectiveness of low-light image enhancement techniques has greatly advanced. However, the compute-intensive and memory-intensive characteristics of convolutional neural networks make them difficult to implement on mobile platforms with low power and limited bandwidth. This paper proposes a complete solution for low-light image enhancement from CNN model to mobile (M2M) FPGA. The proposed solution utilizes a pseudo-symmetry quantization method to compress the low-light image enhancement model, and an accelerator that significantly improves the processing ability of the system. We implemented the whole system on a customized FPGA SoC platform (a low-cost chip, the Xilinx ZYNQ XC7Z035). Extensive experiments show that our method achieves competitive results against three other platforms: better speed compared to ARM and CPU, and better power efficiency compared to ARM, CPU, and GPU.

Ying Chen, Wei Wang, Wei Hu, Xin Xu
Character Flow Detection and Rectification for Scene Text Spotting

Text is widely found in natural scenes. However, it is considerably difficult to detect and recognize scene text due to its variations and distortions. In this paper, we propose a three-stage bottom-up scene text spotter, comprising text segmentation, text rectification and text recognition. The text segmentation part adopts a feature pyramid network (FPN) to extract character instances by combining local and global information; a joint network of FPN and bidirectional long short-term memory is then developed to explore the affinity among the isolated characters, which are grouped into character flows. The text rectification part utilizes a spatial transformer network to deal with the complex deformation of the character flows, thus enhancing their readability. Finally, the rectified text is recognized through an attention-based sequence recognition network. Extensive experiments are conducted on several benchmarks, showing that our approach achieves state-of-the-art performance.

Beiji Zou, Wenjun Yang, Kaiwen Li, Enquan Huang, Shu Liu
A Deep Learning Method for 2D Image Stippling

Stippling is a fascinating art form widely used in the printing industry. In computer graphics, digital color stippling produces colored points with a certain distribution (e.g., a blue noise distribution) from an input color image. It is challenging because each color channel should be evenly distributed with respect to each other channel. Deep learning approaches have shown great advantages in many image stylization applications but have not yet been utilized for stippling, mainly because stippling has strict constraints that require an even and random distribution of points. In this paper, we propose the first deep learning approach for stippling, which produces point distributions visually similar to hand-made stippling. We regard the stippling result as a 3D point cloud structure in which the third channel represents color. We then propose a deep network to transform images into point distributions, consisting of a feature-extracting encoder that extracts features from the input image and a point-generating decoder that translates the features into stippling form. We exploit a spectrum loss to achieve the even distribution. As a result, our method produces color stippling at reasonable cost, with a reasonable balance between the quality of the results and computational efficiency.

Zhongmin Xue, Beibei Wang, Lei Ma

Medical Imaging

Frontmatter
In Silico Heart Versatile Graphical Interface with Systole and Diastole Phases Customizable for Diversified Arrhythmias Simulations

Heart computational models in graphical interfaces provide realistic cardiac beating simulation and suitable user interactivity. This work presents an interface for heart simulation specially devised to control the cardiac systole and diastole phases for arrhythmia simulations while simultaneously interacting with the heart model. The simulation consists of rigging the mesh of a 3D heart model to generate keyframes for morphing. The interface provides cardiac beating motion simulation at a regular rate, with arrhythmias adjustable by the user. It also provides, in real time, information about and control of the cardiac phase durations. Furthermore, the user can manipulate and interact with the heart model using a naked-eye hologram interface. The interface is applicable in cardiology education and training and can be upgraded to explore new medical applications in in-silico cardiology. The main contribution and technical novelty of the present work concern a heart beating simulator customizable to a patient's cardiac rhythm.

C. M. G. Godoy, M. C. Selusniacki, V. S. dos Santos, C. C. Godoy, G. M. dos Santos, R. C. Coelho
ADD-Net: Attention U-Net with Dilated Skip Connection and Dense Connected Decoder for Retinal Vessel Segmentation

Retinal vessel segmentation is an essential step in the diagnosis of many diseases. Due to the large number of capillaries and the complex branch structure, efficient and accurate segmentation of fundus vessels is a huge challenge. In this paper, we propose an improved U-shape network aimed at the problem of complex vessel segmentation, especially of thin, obscure vessels. Firstly, we propose a new attention module, including channel attention and spatial attention, to build connections between channels and learn to focus on the crucial representations. Secondly, we improve the skip connections by adding dilated convolutions, which not only cope with the semantic gap between low-dimension and high-dimension features but also extract rich context information in the encoder. Finally, the idea of dense connection is adopted in the decoder to fuse feature representations at low computational and parameter cost. Experimental results show that our method efficiently obtains accurate segmentations and achieves state-of-the-art performance on the public datasets DRIVE and CHASE_DB1.

Dongjin Huang, Hao Guo, Yue Zhang
BDFNet: Boundary-Assisted and Discriminative Feature Extraction Network for COVID-19 Lung Infection Segmentation

The coronavirus disease (COVID-19) pandemic has affected billions of lives around the world since its first outbreak in 2019. Computed tomography (CT) is a valuable tool for COVID-19-associated clinical diagnosis, and deep learning has been extensively used to improve the analysis of CT images. However, owing to the limitations of the publicly available COVID-19 imaging datasets and the randomness and variability of the infected areas, it is challenging for current segmentation methods to achieve satisfactory performance. In this paper, we propose a novel boundary-assisted and discriminative feature extraction network (BDFNet), which can be used to improve segmentation accuracy. We adopt the triplet attention (TA) module to extract discriminative image representations, and the adaptive feature fusion (AFF) module to fuse texture information and shape information. In addition to the channel and spatial dimensions mainly used in previous models, cross channel-spatial context is also captured in our model via the TA module. Moreover, fused hierarchical boundary information is integrated through the AFF module. In experiments conducted on two publicly accessible COVID-19 datasets, COVID-19-CT-Seg and CC-CCII, BDFNet performs better than most cutting-edge segmentation algorithms on six widely used segmentation metrics.

Hui Ding, Qirui Niu, Yufeng Nie, Yuanyuan Shang, Nianzhe Chen, Rui Liu
A Classification Network for Ocular Diseases Based on Structure Feature and Visual Attention

With the rapid development of digital image processing and machine learning technology, computer-aided diagnosis for ocular diseases has become an active topic in the medical image processing and analysis field. Optical coherence tomography (OCT), one of the most promising new tomography techniques, has been widely used in the clinical diagnosis of ophthalmology and dentistry. To mitigate the lack of professional ophthalmologists and enable intelligent diagnosis of different ocular diseases, we propose a convolutional neural network (CNN) based on structure features and visual attention for ocular disease classification. We first preprocess the OCT images according to the characteristics of OCT data to enhance image quality. We then use the CNN with a structure prior to classify five kinds of ocular conditions: age-related macular degeneration (AMD), diabetic macular edema (DME), normal (NM), polypoidal choroidal vasculopathy (PCV), and pathologic myopia (PM). Besides, a visual attention mechanism is used to enhance the network's ability to represent effective features. The experimental results show that our method outperforms most state-of-the-art algorithms in classification accuracy for different ocular diseases on the OCT dataset.

Yang Wen, Yupeng Xu, Kun Liu, Bin Sheng, Lei Bi, Jinman Kim, Xiangui He, Xun Xu

Physics-Based Simulation

Frontmatter
DSNet: Dynamic Skin Deformation Prediction by Recurrent Neural Network

Skin dynamics contributes to the enriched realism of human body models in rendered scenes. Traditional methods rely on physics-based simulations to accurately reproduce the dynamic behavior of soft tissues; due to their model complexity, however, they do not directly offer practical solutions in domains where real-time performance is desirable. The quality shapes obtained by physics-based simulations are not fully exploited by example-based or more recent data-driven methods either, with most of them having focused on modeling static skin shapes. To address these limitations, we present a learning-based method for dynamic skin deformation. At the core of our work is a recurrent neural network that learns to predict the nonlinear, dynamics-dependent shape change over time from pre-existing mesh sequences. After training, the network delivers realistic, high-quality skin dynamics specific to a person, in real time. Our results show significantly reduced computation time while maintaining prediction quality comparable to the state-of-the-art.
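
A hedged PyTorch sketch of the recurrent idea: a GRU mapping a window of pose history to per-vertex dynamic offsets (architecture and dimensions are placeholders, not the paper's network):

    import torch.nn as nn

    class SkinDynamicsRNN(nn.Module):
        """Predict dynamic per-vertex offsets from recent pose frames."""
        def __init__(self, pose_dim=72, hidden=256, n_verts=6890):
            super().__init__()
            self.gru = nn.GRU(pose_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_verts * 3)

        def forward(self, pose_seq):           # (B, T, pose_dim)
            out, _ = self.gru(pose_seq)        # (B, T, hidden)
            offsets = self.head(out[:, -1])    # offsets for the last frame
            return offsets.view(pose_seq.shape[0], -1, 3)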

Hyewon Seo, Kaifeng Zou, Frederic Cordier
Curvature Analysis of Sculpted Hair Meshes for Hair Guides Generation

This paper proposes an approach that generates hair guides from a sculpted 3D mesh, thus accelerating hair creation. Our approach relies on the local curvature on a sculpted mesh to discover the direction of the hair on the surface. We generate hair guides by following the identified strips of polygons matching hair strands. To improve the quality of the guides, some are split to ensure they correspond to hairstyles ranging from straight to wavy, while others are connected so that they correspond to longer hair strands. In order to automatically attach the guides to the scalp of a 3D head, a vector field is computed based on the directions of the guides, and is used in a backward growth of the guides toward the scalp. This approach is novel since there is no state-of-the-art method that generates hair from a sculpted mesh. Furthermore, we demonstrate how our approach works on different hair meshes. Compared to several hours of manual work to achieve a similar result, our guides are generated in a few minutes.

Florian Pellegrin, Andre Beauchamp, Eric Paquette
Synthesizing Human Faces Using Latent Space Factorization and Local Weights

We propose a 3D face generative model with local weights to increase the model’s variations and expressiveness. The proposed model allows partial manipulation of the face while still learning the whole face mesh. For this purpose, we address an effective way to extract local facial features from the entire data and explore a way to manipulate them during a holistic generation. First, we factorize the latent space of the whole face to the subspace indicating different parts of the face. In addition, local weights generated by non-negative matrix factorization are applied to the factorized latent space so that the decomposed part space is semantically meaningful. We experiment with our model and observe that effective facial part manipulation is possible, and that the model’s expressiveness is improved.

Minyoung Kim, Young J. Kim
CFMNet: Coarse-to-Fine Cascaded Feature Mapping Network for Hair Attribute Transfer

Recently, GAN-based manipulation methods have been proposed to effectively edit and transfer facial attributes. However, these state-of-the-art methods usually fail to delicately manipulate hair attributes, because hair lacks a concrete shape and varies greatly in its flexible structure. High-fidelity hair attribute transfer therefore becomes a challenging task. In this paper, we propose a coarse-to-fine cascaded feature mapping network (CFMNet), which disentangles hair into coarse-grained and fine-grained attributes and transforms hair features in latent space according to a reference image. The disentangled hair attributes consist of coarse-grained labels, including length, waviness and bangs, and a fine-grained 3D model, including geometry and color. We then design a cascaded feature mapping network to manipulate the attributes in a coarse-to-fine way between source and reference images, which adjusts and controls hair features more delicately. Moreover, we construct an identity loss to avoid destroying identity information in the source image. A variety of experimental results demonstrate the effectiveness of our proposed method.

Zhifeng Xie, Guisong Zhang, Chunpeng Yu, Jiaheng Zheng, Bin Sheng

Rendering and Textures

Frontmatter
Dynamic Shadow Synthesis Using Silhouette Edge Optimization

The shadow volume is utilized extensively in real-time rendering applications, which involves updating volumes and calculating silhouette edges. Existing shadow volume methods are CPU intensive, and complex occluders result in poor rendering efficiency. In this paper, we propose a hash-culling shadow volume algorithm that uses hash-based acceleration for silhouette edge determination, the most time-consuming step in the traditional shadow volume algorithm. Our method uses a hash table to store silhouette edge index information, reducing the time spent on redundant edge detection. This significantly reduces CPU usage and improves time efficiency. Furthermore, on low-end hardware, especially embedded systems, it is still difficult to render dynamic shadows due to their high demand on the fill-rate capacity of graphics hardware. Our method has low hardware requirements and is easy to implement on PCs and embedded systems, delivering real-time rendering performance with visually pleasing shadow effects.

Jihong Wang, Zhen Li, Saba Ghazanfar Ali, Bin Sheng, Ping Li, Xiaoyu Chi, Jinman Kim, Lijuan Mao
DDISH-GI: Dynamic Distributed Spherical Harmonics Global Illumination

We propose a real-time hybrid rendering algorithm that off-loads the computationally complex rendering of indirect lighting from mobile client devices to dedicated ray tracing hardware on a server. Spherical harmonics (SH) light probes are updated with path tracing on the server side, and the final frame is rendered with a fast rasterization-based pipeline that uses the light probes to approximate high-quality indirect diffuse lighting and glossy specular reflections. The rendering workload can thus be split across multiple devices on the network with low bandwidth usage. The approach also benefits multi-user and multi-view scenarios by separating indirect lighting computation from camera positioning, and it is more robust to network interruptions and latency than streaming fully remotely rendered frames. Furthermore, we propose a specular approximation for GGX materials via zonal harmonics (ZH). This alleviates the need to implement more computationally complex algorithms, such as screen space reflections, which were suggested in the state-of-the-art dynamic diffuse global illumination (DDGI) method. We show that the image quality of the proposed method is similar to that of DDGI, with a 23 times more compact data structure.
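
For context, a first-order SH probe lookup for diffuse shading (the standard band-0/1 evaluation with cosine-lobe convolution weights; DDISH-GI's probe layout, higher bands and ZH specular term are not reproduced here):

    import numpy as np

    Y0, Y1 = 0.2820948, 0.4886025       # SH basis constants, bands 0-1
    A0, A1 = np.pi, 2.0 * np.pi / 3.0   # cosine-lobe convolution weights

    def sh_irradiance(coeffs, n):
        """coeffs: (4, 3) RGB probe coefficients [l=0, then l=1 (y, z, x)];
        n: unit surface normal. Returns (3,) diffuse irradiance."""
        basis = np.array([Y0, Y1 * n[1], Y1 * n[2], Y1 * n[0]])
        return (np.array([A0, A1, A1, A1]) * basis) @ coeffs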

Julius Ikkala, Petrus Kivi, Joel Alanko, Markku Mäkitalo, Pekka Jääskeläinen
Simplicity Driven Edge Refinement and Color Reconstruction in Image Vectorization

Gestalt psychology indicates that simplicity is central to image vectorization, i.e., observers tend to perceive jagged raster edges as piecewise smooth curves and color changes as being either gradual (along edges) or abrupt (across edges). In this paper, we give a pair of simplicity-driven formulations to respectively cope with the two challenges. In detail, we formulate the underlying as-rigid-as-possible edges as the axes of symmetry of the edge saliency map, while reconstructing the color field by enforcing the fidelity and the smoothness at the same time (except on the detected boundaries). We finally convert a rasterized image into gradient-aware vector graphics whose base domain is a high-quality triangle mesh. On the one hand, the rigidity of the boundary curves is naturally achieved based on the assumption of simplicity, instead of by an empirically-grounded curve fitting operation; on the other hand, the color of near-boundary regions is inferred by Hessian energy (an extrapolation-like technique). Our vectorization method is able to yield more visually realistic results than existing approaches and is useful in flexible recoloring, shape editing, and hierarchical level-of-detail (HLOD) image representation.

Zheng Zhang, Junhao Zhao, Shiqing Xin, Shuangmin Chen, Yuanfeng Zhou, Changhe Tu, Wenping Wang
Temporal-Consistency-Aware Video Color Transfer

This paper proposes a new temporal-consistency-aware color transfer method based on a quaternion distance metric. Compared with state-of-the-art methods, our method keeps temporal consistency and better reduces artifacts. Firstly, keyframes are extracted from the source video, and color is transferred to them from the reference image through soft segmentation based on Gaussian mixture models (GMM). Then a quaternion-based method is proposed to transfer color from the keyframes to the other frames iteratively. Specifically, this method analyzes the color information of each pixel along five directions to detect its best matching pixel through a quaternion-based distance metric. Additionally, considering the accumulation of errors across frame sequences, an effective abnormal color correction mechanism is designed to improve the color transfer quality. A quantitative evaluation metric is further proposed to measure the temporal consistency of the output video. Various experimental results validate the effectiveness of our method.

Shiguang Liu, Yu Zhang
An Improved Advancing-Front-Delaunay Method for Triangular Mesh Generation

The triangular mesh is widely used in computer graphics, and the advancing-front-Delaunay method is a mainstream way to generate it. However, that method generates interior nodes on the basis of a segment front and must manage and update the set of front segments carefully. This paper describes an improved advancing-front-Delaunay method that generates interior nodes based on a node front. The node front can be implemented easily by our disk packing algorithm and needs no complicated management strategy. Moreover, unlike the traditional advancing-front-Delaunay method, which generates interior nodes and the mesh at the same time, our method first generates all the nodes with the disk packing method and then generates the mesh. Hence, it can exploit efficient algorithms that build a Delaunay triangulation over a fixed node set, or algorithms that insert the interior nodes in a carefully designed sequence. Four examples show the effectiveness and robustness of the improved advancing-front-Delaunay method.
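
A minimal scipy illustration of the nodes-first pipeline (naive dart-throwing stands in for the paper's disk packing algorithm; boundary handling is omitted):

    import numpy as np
    from scipy.spatial import Delaunay

    def pack_disks(n_target, radius, max_tries=100000, seed=0):
        """Accept uniform samples in the unit square that keep at least
        `radius` from every accepted node (naive disk packing)."""
        rng = np.random.default_rng(seed)
        pts = []
        for _ in range(max_tries):
            if len(pts) == n_target:
                break
            p = rng.random(2)
            if all(np.linalg.norm(p - q) >= radius for q in pts):
                pts.append(p)
        return np.array(pts)

    nodes = pack_disks(200, 0.05)
    mesh = Delaunay(nodes)        # triangulate the fixed node set
    print(mesh.simplices.shape)   # (n_triangles, 3)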

Yufei Guo, Xuhui Huang, Zhe Ma, Yongqing Hai, Rongli Zhao, Kewu Sun

Robotics and Vision

Frontmatter
Does Elderly Enjoy Playing Bingo with a Robot? A Case Study with the Humanoid Robot Nadine

There have been considerable advancements in medical health care in recent years, resulting in a rising older population. As the workforce for such a population is not keeping pace, there is an urgent need to address this problem. Having robots host stimulating recreational activities for older adults can reduce the workload of caretakers and give them time to address the emotional needs of the elderly. In this paper, we investigate the effects of the humanoid social robot Nadine as an activity host for the elderly. We evaluate this by placing the Nadine humanoid social robot in a nursing home as a caretaker, where she hosts a bingo game. We record sessions with and without Nadine to understand the difference between, and the acceptance of, these two scenarios. We use computer vision methods to analyse the activities of the elderly, detecting their emotions and their involvement in the game. Our results show positive reinforcement during the recreational activity, bingo, in the presence of Nadine. This research is in line with all ethical recommendations, as shown in our Annex.

Nidhi Mishra, Gauri Tulsulkar, Hanhui Li, Nadia Magnenat Thalmann, Lim Hwee Er, Lee Mei Ping, Cheng Siok Khoong
Resilient Navigation Among Dynamic Agents with Hierarchical Reinforcement Learning

Learning a safe and efficient navigation policy without knowing surrounding agents' intent is a hard problem, for two reasons: the agent faces high environmental uncertainty, since it cannot control the other agents in the environment, and the navigation algorithm must be resilient to various scenes. Recently, reinforcement learning based navigation has attracted researchers' interest. We present a hierarchical reinforcement learning based navigation algorithm. The two-level structure decouples the navigation task into target-driven navigation and collision avoidance, leading to faster and more stable training. Compared with the reinforcement learning based navigation methods of recent years, we verify our model's navigation ability and its resilience across different scenes.

Sijia Wang, Hao Jiang, Zhaoqi Wang

Visual Analytics

Frontmatter
MeshChain: Secure 3D Model and Intellectual Property Management Powered by Blockchain Technology

The intellectual value of digitized 3D properties in scientific, artistic, historical, and entertainment domains is increasing. However, little attention has been paid to designing an immutable, secure database for their management. We propose a secure 3D property management platform powered by blockchain and decentralized storage. The platform connects various 3D modeling tools to a decentralized network-based database built on blockchain and decentralized storage technologies, and provides commit and checkout of 3D models to that network. This structure provides protection of 3D data from damage and attacks, intellectual property (IP) management, and data source authentication. We analyze its performance and show its applications to cooperative 3D modeling and IP management.

Hunmin Park, Yuchi Huo, Sung-Eui Yoon
Image Emotion Analysis Based on the Distance Relation of Emotion Categories via Deep Metric Learning

Existing deep learning-based image emotion analysis methods regard image emotion classification as a standard classification task in which the semantics of the categories are clear. Nevertheless, the semantics of emotion categories are fuzzy, so people are ambiguous between emotions at similar semantic distances when observing images. Considering the semantic distance between emotion categories, that is, the far or near relations between them, we design a similarity decline rule to first pre-process the similarities of sample pairs, making them comparable. Image emotion analysis is then performed through deep metric learning. For the key issues in deep metric learning, namely sampling and weighting, we design adaptive decision boundaries for sampling and a double-weighted mechanism for the sampled pairs, integrated into our proposed emotion constraint loss, which boosts the weights of pairs that contribute more information to model updates. More expressive embedding features are therefore learned in the embedding space, so that the similarity of pairs from adjacent categories is larger than that of pairs from far-away ones. The experimental results demonstrate that our proposed method outperforms the state-of-the-art methods. In addition, ablation experiments show that it is necessary to consider the semantic distance between emotion categories in image emotion analysis.

Guoqin Peng, Hao Zhang, Dan Xu
How Much Do We Perceive Geometric Features, Personalities and Emotions in Avatars?

The goal of this paper is to evaluate human perception of geometric features, personalities and emotions in avatars. To achieve this, we used a dataset that contains pedestrian tracking files captured from spontaneous videos, visualized as identical virtual human beings; the main objective is to focus on the individuals' motion without the distraction of other features. In addition to the pedestrian positions, the dataset also contains emotion and personality data for each pedestrian, detected through computer vision and pattern recognition techniques. We are interested in evaluating whether participants can perceive geometric features such as density levels, distances, angular variations and speeds, as well as cultural features (emotions and personality traits), in short video sequences (scenes) in which pedestrians are represented by avatars. With this aim in mind, we pose two questions: i) "Can people perceive geometric features in avatars?"; and ii) "Can people perceive differences in personalities and emotions in virtual humans without body and facial expressions?". In total, 73 people volunteered for the experiment to answer the two questions. The results indicate that, even without explaining to the participants the concepts of cultural features and how they were computed (from the geometric features), in most cases the participants perceived the personality and emotion expressed by the avatars, even without faces and body expressions.

Victor Araujo, Bruna Dalmoro, Rodolfo Favaretto, Felipe Vilanova, Angelo Costa, Soraia Raupp Musse
High-Dimensional Dataset Simplification by Laplace-Beltrami Operator

With the development of the Internet and other digital technologies, the speed of data generation has become considerably faster than the speed of data processing. Because big data typically contain massive redundant information, it is possible to significantly simplify a big dataset while maintaining the key information. In this paper, we develop a high-dimensional (HD) dataset simplification method based on the eigenvalues and eigenfunctions of the Laplace-Beltrami operator (LBO). Specifically, given a dataset that can be considered an unorganized point set in an HD space, a discrete LBO defined on the HD dataset is constructed, and its eigenvalues and eigenvectors are calculated. The local extrema and saddle points of the eigenvectors are then taken as the feature points of the HD dataset, constituting a simplified dataset. Moreover, we develop feature point detection methods for functions defined on an unorganized point set in HD space, and devise metrics for measuring the fidelity of the simplified dataset to the original one. Finally, examples and applications validate the efficiency and effectiveness of the proposed methods, demonstrating that the developed HD dataset simplification method is feasible for processing very large datasets with limited data processing capability.
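
A compact sketch of the discrete pipeline under common assumptions (a kNN graph Laplacian stands in for the discrete LBO; scikit-learn and scipy replace the paper's own construction):

    import numpy as np
    from scipy.sparse.csgraph import laplacian
    from scipy.sparse.linalg import eigsh
    from sklearn.neighbors import kneighbors_graph

    def laplacian_eigenmaps(points, k=10, n_eigs=8):
        """points: (N, D) unorganized HD point set. Returns the smallest
        eigenpairs of the graph Laplacian; their local extrema and saddle
        points would serve as the feature points described above."""
        W = kneighbors_graph(points, k, mode='connectivity')
        W = 0.5 * (W + W.T)              # symmetrize the kNN graph
        L = laplacian(W, normed=True)
        vals, vecs = eigsh(L, k=n_eigs, which='SM')  # smoothest modes
        return vals, vecs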

Chenkai Xu, Hongwei Lin
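
As a rough sketch of the pipeline described above (not the authors' implementation), one can approximate the discrete LBO of an unorganized HD point set by a kNN graph Laplacian and keep local extrema of its low-order eigenvectors as feature points. The saddle-point detection used in the paper is omitted here, and the parameters `k` and `n_eig` are illustrative assumptions.

```python
import numpy as np
from scipy.sparse import csgraph
from scipy.sparse.linalg import eigsh
from sklearn.neighbors import kneighbors_graph

def lbo_feature_points(X, k=10, n_eig=8):
    # Graph-Laplacian approximation of the LBO on the point set X (N x d).
    W = kneighbors_graph(X, k, mode='distance', include_self=False)
    W = 0.5 * (W + W.T)                            # symmetrize the kNN graph
    W.data = np.exp(-W.data**2 / W.data.mean()**2) # Gaussian edge weights
    L = csgraph.laplacian(W, normed=True)
    _, vecs = eigsh(L, k=n_eig, which='SM')        # low-frequency eigenvectors
    nbrs = W.tolil().rows                          # adjacency lists
    keep = set()
    for f in vecs.T:                               # each discrete eigenfunction
        for i, nb in enumerate(nbrs):
            if nb and (f[i] > f[nb].max() or f[i] < f[nb].min()):
                keep.add(i)                        # local extremum -> feature point
    return np.asarray(sorted(keep))
```
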

VR/AR

Frontmatter
Characterizing Visual Acuity in the Use of Head Mounted Displays

In the real world the sense of sight is dominant for humans, and normal vision is essential to perform well in many common tasks; ophthalmology offers several tools to assess and correct a person's vision. In VR, when wearing an HMD, even a user with normal vision faces additional hurdles that affect perceptual acuity in the virtual environment, negatively impacting performance in the application task. Display resolution, but also soiled lenses and bad vergence adjustment, are examples of possible issues. To better understand and tackle this problem, we provide a study on assessing visual acuity in a VR setup. We conducted an experimental evaluation with users and found, among other results, that visual acuity in VR is significantly and considerably lower than in real environments. In addition, we found several correlations between the measured acuity and task performance on the one hand, and difficulty adjusting the HMD and the use of prescription glasses on the other.

Vladimir Soares da Fontoura, Anderson Maciel
Effects of Different Proximity-Based Feedback on Virtual Hand Pointing in Virtual Reality

Virtual hand pointing is a natural interaction method; however, it suffers from depth perception issues in virtual environments. Proximity-based feedback cues, which convey intensity information as the pointer approaches the target, may improve virtual hand pointing performance in virtual environments. However, little is known about the effects of such feedback cues on the depth movement phases of a virtual hand pointing task. This work therefore focuses on the effects of different feedback cues (visual (V), auditory (A), haptic (H), or any combination of them) on a virtual hand pointing task in view and lateral directions in virtual environments. Results show that, compared with the other feedback types, the haptic cue significantly reduced movement time, particularly at a larger visual depth. We further analyzed the sub-movement phases (ballistic and correction) and found that participants achieved the shortest ballistic time with A+H and A+H+V, and the shortest correction time with H. However, no significant differences between feedback conditions were found for speed, error rate, and throughput. In addition, we discuss the implications of these findings and present future work.

Yujun Lu, BoYu Gao, Huawei Tu, Weiqi Luo, HyungSeok Kim
Virtual Scenes Construction Promotes Traditional Chinese Art Preservation

Chinese traditional opera is a valuable and fascinating world heritage asset, being one of the most representative folk arts in Chinese history. Its characteristic 'suppositionality' in stage scenery makes it well suited to preservation by digital means, e.g., 3D animation and virtual reality-based art shows. In this novel digital art form, the construction of virtual scenes is an important pillar: a variety of created models, including stage props and backgrounds, must be accommodated to provide a vivid performance stage. However, generating such scenes through traditional manual 3D prop modelling is a tedious and strenuous task. In this paper, a novel shadow puppetry virtual stage scene construction approach based on semantics and prior probability is proposed for the generation of compositional virtual scenes. First, primitive models are obtained through semantic text segmentation and retrieval for scene composition; then, a placement algorithm based on prior probability assigns these 3D models within the virtual scene. The method is tested by generating the virtual performance stage for our shadow puppetry prototype system, in which various traditional art-specific 3D models are assembled. Its ease of use can help artists create visually plausible virtual stages without professional scene modelling skills. A user study indicates the effectiveness and efficiency of our approach.

Hui Liang, Fanyu Bao, Yusheng Sun, Chao Ge, Jian Chang
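
The two-stage pipeline (semantic retrieval, then prior-probability placement) can be illustrated with a deliberately simplified sketch; the categories, stage zones, and probabilities below are invented placeholders, not data from the paper.

```python
import random

# Invented prior table: stage zones and probabilities per prop category,
# standing in for priors learned from annotated example scenes.
PLACEMENT_PRIOR = {
    "table":   [("stage_center", 0.6), ("stage_left", 0.25), ("stage_right", 0.15)],
    "lantern": [("backdrop_top", 0.7), ("stage_left", 0.3)],
}

def place_props(categories):
    """Assign each retrieved 3D model a stage zone by sampling its prior."""
    scene = []
    for cat in categories:
        zones, probs = zip(*PLACEMENT_PRIOR[cat])
        scene.append((cat, random.choices(zones, weights=probs)[0]))
    return scene

print(place_props(["table", "lantern"]))  # e.g. [('table', 'stage_center'), ...]
```
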
A Preliminary Work: Mixed Reality-Integrated Computer-Aided Surgical Navigation System for Paranasal Sinus Surgery Using Microsoft HoloLens 2

Paranasal sinus surgery has high demands for minimal invasion and safety. Computer-aided surgical navigation (CSN) applications have become standard surgical practice: a user's operations can be guided with visually complementary data such as preoperative medical imaging. Mixed reality head-mounted display (MR-HMD) technologies are a promising research direction for enhancing the usability of paranasal sinus CSN applications. Combining an MR-HMD with CSN provides a physically unified environment in which the user's field of view of the intraoperative site can be augmented with complementary preoperative data, thereby enhancing situational awareness. In this study, we present an early phase of MR introduction for paranasal sinus surgery. We developed an alpha version of a commercial paranasal sinus CSN application using 3D Slicer, a dominant open-source clinical software development platform, and then implemented a scene-sharing extension module. We refer to the result as the MR-CSN system. It enables a networked user wearing an MR-HMD to receive MR-enhanced navigation: their handling of surgical instruments at the intraoperative site is aided by real-time information from the CSN application. The feasibility of our MR-CSN system was evaluated through a paranasal sinus surgical simulation with a phantom model.

Sungmin Lee, Hoijoon Jung, Euro Lee, Younhyun Jung, Seon Tae Kim
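
The scene-sharing idea, i.e. streaming real-time navigation data from the CSN host to the networked MR-HMD, might look roughly like the following hypothetical loop; the transport, message format, address, and `get_tool_pose` helper are all assumptions, not details of the actual 3D Slicer extension module.

```python
import json
import socket
import time

def stream_tool_pose(get_tool_pose, hmd_addr=("192.168.0.20", 9000), hz=60):
    """Broadcast the tracked instrument pose to the HMD client as JSON over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        pos, quat = get_tool_pose()   # position (x, y, z), rotation (w, x, y, z)
        msg = json.dumps({"pos": pos, "rot": quat, "t": time.time()})
        sock.sendto(msg.encode("utf-8"), hmd_addr)
        time.sleep(1.0 / hz)          # fixed-rate update loop
```
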

Engage

Frontmatter
Algorithms for Multi-conditioned Conic Fitting in Geometric Algebra for Conics

We introduce implementations of several conic fitting algorithms in Geometric Algebra for Conics. In particular, we incorporate additional conditions into the optimisation problem, such as a centre point position at the origin of the coordinate system, axial alignment with the coordinate axes, or a combination of both. We provide the mathematical formulation together with an implementation in MATLAB. Finally, we present examples on a sample dataset and discuss possible uses of the algorithms.

Pavel Loučka, Petr Vašík
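
For intuition, the classical linear-algebra analogue of such multi-conditioned fitting (not the paper's geometric-algebra formulation, and in NumPy rather than the paper's MATLAB) drops the xy column to enforce axial alignment and the linear columns to fix the centre at the origin:

```python
import numpy as np

def fit_conic(x, y, axis_aligned=True, centred=False):
    """Algebraic conic fit A x^2 + B xy + C y^2 + D x + E y + F = 0 with
    optional conditions: drop xy (axial alignment), drop x, y (centre at origin)."""
    cols = [x**2, x*y, y**2, x, y, np.ones_like(x)]
    keep = [True, not axis_aligned, True, not centred, not centred, True]
    D = np.column_stack([c for c, k in zip(cols, keep) if k])
    w = np.linalg.svd(D)[2][-1]        # minimiser of ||D w|| with ||w|| = 1
    full = np.zeros(6)
    full[np.where(keep)[0]] = w        # re-insert zeros for dropped terms
    return full
```
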
Special Affine Fourier Transform for Space-Time Algebra Signals

We generalize the space-time Fourier transform (SFT) [9] to a special affine Fourier transform (SASFT, also known as offset linear canonical transform) for 16-dimensional space-time multivector Cl(3, 1)-valued signals over the domain of space-time (Minkowski space) $$\mathbb{R}^{3,1}$$. We establish how it can be computed in terms of the SFT, and introduce its properties of multivector coefficient linearity, shift and modulation, inversion, the Rayleigh (Parseval) energy theorem, partial derivative identities, a directional uncertainty principle, and its specialization to coordinates.

Eckhard Hitzer
On Explicit Formulas for Characteristic Polynomial Coefficients in Geometric Algebras

In this paper, we discuss characteristic polynomials in (Clifford) geometric algebras $$\mathcal{G}_{p,q}$$ of a vector space of dimension $$n=p+q$$. For the first time, we present explicit formulas for all characteristic polynomial coefficients in the case $$n=5$$. The formulas involve only the operations of geometric product, summation, and the operations of conjugation. For the first time, we present an analytic proof of the corresponding formulas in the case $$n=4$$. We present some new properties of the operations of conjugation and grade projection and use them to obtain the main results of this paper. These results can be used in various applications of geometric algebras in computer graphics, computer vision, engineering, and physics. The presented explicit formulas for characteristic polynomial coefficients can also be used in symbolic computation.

Kamron Abdulkhaev, Dmitry Shirokov
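
The paper's closed-form formulas are not reproduced here, but the related recursive (Faddeev-LeVerrier-type) scheme for characteristic polynomial coefficients in geometric algebras, known from earlier work, can be sketched with the `clifford` Python package; treat the snippet as an assumption-laden illustration, not the paper's contribution.

```python
from clifford import Cl

def char_poly_coeffs(U, n):
    """Recursive characteristic polynomial coefficients of a multivector U
    in an n-dimensional geometric algebra, using only the geometric
    product, subtraction, and scalar-grade projection."""
    N = 2 ** ((n + 1) // 2)            # size of the matrix representation
    coeffs, Uk = [], U
    for k in range(1, N + 1):
        c = (N / k) * Uk.value[0]      # c_k = (N / k) <U_(k)>_0
        coeffs.append(c)
        Uk = U * (Uk - c)              # U_(k+1) = U (U_(k) - c_k)
    return coeffs

layout, blades = Cl(3)                 # G(3,0) as a small example
U = 1 + 2 * blades['e1'] + blades['e12']
print(char_poly_coeffs(U, 3))
```
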
Unified Expression Frame of Geodetic Stations Based on Conformal Geometric Algebra

Global geodetic stations are the basis for studying the International Terrestrial Reference Frame (ITRF) and crustal plate motion. We collected a total of 288 global geodetic stations computed by four space geodetic techniques. Because different types of geodetic stations use different spatial references, their data can only be combined after the necessary spatial conversion, and, as the Earth's crust moves, the conversion methods and formulas are constantly being revised. The complexity and dynamics of this spatial conversion have greatly hindered the integrated application of different global geodetic stations. This paper proposes a unified expression frame for global geodetic stations. Based on the theory of conformal geometric algebra, we introduce a motion operator that uniformly expresses translation, rotation, and scaling in the form of a versor product. From the perspective of the unified expression and calculation of this operator, a unified expression frame and an ITRF conversion method for the different reference frames of geodetic stations are constructed. An experimental case shows that the ITRF conversion method based on the unified motion operator reduces the complexity of reference frame conversion, providing a useful reference for the unified expression and analysis of frame conversions and crustal plate motion.

Zhenjun Yan, Zhaoyuan Yu, Yun Wang, Wen Luo, Jiyi Zhang, Hong Gao, Linwang Yuan
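
A minimal sketch of the versor idea, using the `clifford` package's conformal model of 3D space: translation and rotation (and, analogously, scaling) are each encoded in a single motion operator M and applied by one sandwich product. The numbers are arbitrary and the snippet is not the paper's implementation.

```python
from clifford.g3c import *             # conformal model of 3D Euclidean space
import math

def translator(t):                     # t: Euclidean translation vector
    return 1 - 0.5 * t * einf

def rotor(theta, B):                   # B: unit bivector (rotation plane)
    return math.cos(theta / 2) - math.sin(theta / 2) * B

# One motion operator combines both; scaling would enter the same way.
M = rotor(1e-8, e1 ^ e2) * translator(0.01 * e1)
P = up(6378137.0 * e1)                 # conformal embedding of a station
print(down(M * P * ~M))                # transformed station coordinate
```
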
Never ‘Drop the Ball’ in the Operating Room: An Efficient Hand-Based VR HMD Controller Interpolation Algorithm, for Collaborative, Networked Virtual Environments

In this work, we propose two algorithms that can be applied in the context of a networked virtual environment to efficiently handle the interpolation of displacement data for hand-based VR HMD controllers. Our algorithms, based on dual quaternions and multivectors respectively, reduce the network consumption rate and are highly effective in scenarios involving multiple users. We illustrate convincing results in a modern game engine and in a medical VR collaborative training scenario.

Manos Kamarianakis, Nick Lydatakis, George Papagiannakis
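
For context, a common baseline for interpolating controller displacement data is dual-quaternion linear blending (DLB); the sketch below shows that baseline, not the paper's two proposed algorithms, and represents a unit dual quaternion as an 8-vector.

```python
import numpy as np

def dlb(q1, q2, t):
    """Dual-quaternion linear blending between two unit dual quaternions,
    each stored as an 8-vector (real quaternion part, then dual part)."""
    if np.dot(q1[:4], q2[:4]) < 0:     # pick the shorter arc
        q2 = -q2
    q = (1 - t) * q1 + t * q2
    return q / np.linalg.norm(q[:4])   # renormalize by the real part

# Example: client-side smoothing between two received controller poses.
prev   = np.array([1, 0, 0, 0,  0, 0, 0, 0], float)          # identity pose
target = np.array([0.7071, 0, 0.7071, 0,  0, 0.01, 0, 0.01], float)
print(dlb(prev, target, 0.25))
```
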
The Rules of 4-Dimensional Perspective: How to Implement Lorentz Transformations in Relativistic Visualization

This paper presents a pedagogical introduction to implementing Lorentz transformations in relativistic visualization. The most efficient approach is to use the even geometric algebra in 3+1 spacetime dimensions, or equivalently complex quaternions, which are fast, compact, robust, and straightforward to compose, interpolate, and spline. The approach has been incorporated into the Black Hole Flight Simulator, an interactive general relativistic ray-tracing program developed by the author.

Andrew J. S. Hamilton
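
A minimal numerical sketch of the underlying mathematics: the even spacetime algebra is isomorphic to the complex quaternions, which can be represented by 2x2 complex matrices built from Pauli matrices, so a Lorentz boost becomes a matrix sandwich product. This illustrates the algebra only and is unrelated to the Black Hole Flight Simulator code.

```python
import numpy as np
from scipy.linalg import expm

# Pauli matrices: basis of the 2x2 matrix representation of complex quaternions.
I2 = np.eye(2)
sx = np.array([[0, 1], [1, 0]], complex)
sy = np.array([[0, -1j], [1j, 0]], complex)
sz = np.array([[1, 0], [0, -1]], complex)

def event(t, x, y, z):
    """Spacetime event as a Hermitian matrix X = t*I + x*sx + y*sy + z*sz."""
    return t * I2 + x * sx + y * sy + z * sz

def boost(rapidity, nx, ny, nz):
    """Boost rotor along the unit 3-vector n (a rotation would use 1j*theta/2)."""
    return expm(0.5 * rapidity * (nx * sx + ny * sy + nz * sz))

L = boost(0.5, 0, 0, 1)                     # boost along z, rapidity 0.5
Xp = L @ event(1.0, 0, 0, 0) @ L.conj().T   # X' = L X L^dagger
print(np.trace(Xp).real / 2)                # t' = cosh(0.5) ~ 1.1276
```
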
Backmatter
Metadata
Title
Advances in Computer Graphics
Edited by
Prof. Nadia Magnenat-Thalmann
Victoria Interrante
Daniel Thalmann
Prof. Dr. George Papagiannakis
Assoc. Prof. Bin Sheng
Assoc. Prof. Jinman Kim
Prof. Marina Gavrilova
Copyright Year
2021
Electronic ISBN
978-3-030-89029-2
Print ISBN
978-3-030-89028-5
DOI
https://doi.org/10.1007/978-3-030-89029-2
