PVIS 2021

Session 1: Machine Learning and Automated Visualization

1. Visual Analysis on Machine Learning Assisted Prediction of Ionic Conductivity for Solid-State Electrolytes [note] [vimeo]

Hui Shao	University of Electronic Science and Technology of China
Jiansu Pu	University of Electronic Science and Technology of China
Dr. Yanlin Zhu	Shenzhen Clean Energy Research Institute
Boyang Gao	University of Electronic Science and Technology of China
Zhengguo Zhu	University of Electronic Science and Technology of China
Yunbo Rao	University of Electronic Science and Technology of China

Abstract: Lithium-ion batteries (LIBs) are widely used as important energy sources in daily life, for example in mobile phones, electric vehicles, and drones. Because of the potential safety risks caused by liquid electrolytes, experts have tried to replace liquid electrolytes with solid ones. However, finding suitable alternative materials by traditional means is very difficult because the search is extremely costly. Machine learning (ML) based methods have recently been introduced for material prediction, but there are few assistive tools designed for domain experts to compare and analyze the performance of ML models intuitively. We therefore propose an interactive visualization system that lets experts select suitable ML models and understand and explore the prediction results comprehensively. Our system employs a multi-faceted visualization scheme designed to support analysis from the perspectives of feature composition, data similarity, model performance, and results presentation. A case study with real lab experiments was carried out by a domain expert, and the results confirmed the effectiveness and helpfulness of our system.

2. A Machine Learning Approach for Predicting Human Preference for Graph Layouts [note] [vimeo] [Best]

Shijun Cai	School of IT, University of Sydney
Seok-Hee Hong	School of IT, University of Sydney
Jialiang Shen	School of IT, University of Sydney
Tongliang Liu	School of IT, University of Sydney

Abstract: Understanding which graph layouts humans prefer, and why they prefer them, is significant and challenging because of the highly complex visual perception and cognition system of the human brain. In this paper, we present the first machine learning approach for predicting human preference for graph layouts. In general, data sets with human preference labels are limited and insufficient for training deep networks. To address this, we train our deep learning model by employing transfer learning, e.g., exploiting quality metrics such as shape-based metrics, edge crossings, and stress, which have been shown to correlate with human preference on graph layouts. Experimental results using ground truth human preference data sets show that our model can successfully predict human preference for graph layouts. To the best of our knowledge, this is the first approach for predicting qualitative evaluation of graph layouts using human preference experiment data.
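For readers unfamiliar with the quality metrics named in this abstract, the following is a minimal sketch of two of them, edge crossings and stress, for a straight-line layout. It is an illustration only and not the authors' code: the function names are hypothetical, the layout is assumed to be a dict from node to (x, y), and stress is taken in one common form, the sum over connected node pairs of (Euclidean distance - graph distance)^2 / (graph distance)^2.

```python
import math
from collections import deque
from itertools import combinations


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, -1, or 0."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def count_crossings(pos, edges):
    """Count pairs of edges whose straight-line segments properly cross."""
    total = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue  # edges that share an endpoint are not counted
        pa, pb, pc, pd = pos[a], pos[b], pos[c], pos[d]
        if (_orient(pa, pb, pc) * _orient(pa, pb, pd) < 0
                and _orient(pc, pd, pa) * _orient(pc, pd, pb) < 0):
            total += 1
    return total


def graph_distances(nodes, edges):
    """Unweighted shortest-path lengths between all nodes, via BFS."""
    adj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = {}
    for source in nodes:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[source] = seen
    return dist


def stress(pos, edges):
    """Sum of squared relative errors between layout and graph distances."""
    nodes = list(pos)
    dist = graph_distances(nodes, edges)
    total = 0.0
    for u, v in combinations(nodes, 2):
        if v not in dist[u]:
            continue  # skip pairs in different connected components
        d = dist[u][v]
        layout_d = math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])
        total += (layout_d - d) ** 2 / d ** 2
    return total
```

Lower values of both metrics are conventionally read as a "better" layout; how such metrics enter the transfer learning step is described in the paper itself, not here.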

3. ADVISor: Automatic Visualization Answer for Natural-Language Question on Tabular Data [paper] [vimeo]

Can Liu	Peking University, Beijing, China
Yun Han	Peking University, Beijing, China
Ruike Jiang	Peking University, Beijing, China
Xiaoru Yuan	Peking University, Beijing, China

Abstract: We propose an automatic pipeline to generate visualization with annotations to answer the natural-language questions raised by the public on tabular data. With a pre-trained language representation model, the input natural language questions and table headers are first encoded into vectors. According to these vectors, a multi-task end-to-end deep neural network extracts the related data areas and the corresponding aggregation type.
We present the result with carefully designed visualization and annotations for different attribute types and tasks. We conducted a comparison experiment with state-of-the-art works and the best commercial tools. The results show that our method outperforms those works with higher accuracy and more effective visualization.

4. Automatic Generation of Unit Visualization-based Scrollytelling for Impromptu Data Facts Delivery [paper]

Junhua Lu	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China
Wei Chen	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China
Hui Ye	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China
Jie Wang	Alibaba Group, Hangzhou, Zhejiang, China
Honghui Mei	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China
Yuhui Gu	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China
Yingcai Wu	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China
Xiaolong (Luke) Zhang	University Park, Pennsylvania, United States
Kwan-Liu Ma	University of California at Davis, Davis, California, United States

Abstract: Data-driven scrollytelling has become a prevalent way of visual communication because of its comprehensive delivery of perspectives derived from the data.
However, creating an expressive scrollytelling story requires both data and design literacy and is time-consuming. As a result, scrollytelling has been mainly used only by professional journalists to disseminate opinions. In this paper, we present an automatic method to generate expressive scrollytelling visualization, which can present easy-to-understand data facts through a carefully arranged sequence of views. The method first enumerates data facts of a given dataset and scores and organizes them. The facts are further assembled, sequenced into a story, with reader input taken into consideration. Finally, visual graphs, transitions, and text descriptions are generated to synthesize the scrollytelling visualization. In this way, non-professionals can easily explore and share interesting perspectives from selected data attributes and fact types. We demonstrate the effectiveness and usability of our method through both use cases and an in-lab user study.

5. Parsing and Summarizing Infographics with Synthetically Trained Icon Detection [paper] [vimeo]

Spandan Madan	Harvard University, Cambridge, Massachusetts, United States
Zoya Bylinskii	Creative Intelligence Lab, Adobe Research, Cambridge, Massachusetts, United States
Carolina Nobre	Harvard University, Cambridge, Massachusetts, United States
Matthew Tancik	UC Berkeley, Berkeley, California, United States
Adria Recasens	Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Kimberli Zhong	Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Sami Alsheikh	Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Aude Oliva	Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Frédo Durand	Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Professor Hanspeter Pfister	Visual Computing Group, Harvard University, Cambridge, Massachusetts, United States

Abstract: Widely used in news, business, and educational media, infographics are handcrafted to effectively communicate messages about complex and often abstract topics including 'ways to conserve the environment' and 'understanding the financial crisis'. The computational understanding of infographics required for future applications like automatic captioning, summarization, search, and question-answering, will depend on being able to parse the visual and textual elements contained within. However, being composed of stylistically and semantically diverse visual and textual elements, infographics pose challenges for current A.I. systems. While automatic text extraction works reasonably well on infographics, standard object detection algorithms fail to identify the stand-alone visual elements in infographics that we refer to as 'icons'. In this paper, we propose a novel approach to train an object detector using synthetically-generated data, and show that it succeeds at generalizing to detecting icons within in-the-wild infographics. We further pair our icon detection approach with an icon classifier and a state-of-the-art text detector to demonstrate three demo applications: topic prediction, multi-modal summarization, and multi-modal search. Parsing the visual and textual elements within infographics provides us with the first steps towards automatic infographic understanding.

Session 2: Temporal and Spatio-Temporal Data

1. Visualising Temporal Uncertainty: A Taxonomy and Call for Systematic Evaluation [note] [vimeo]

Yashvir Singh Grewal	Monash University
Sarah Goodwin	Monash University
Professor Tim Dwyer	Data Visualisation and Immersive Analytics, Monash University

Abstract: Increased reliance on data in decision-making has highlighted the importance of conveying uncertainty in data visualisations. Yet developing visualisation techniques that clearly and accurately convey uncertainty in data is an open challenge across a variety of fields. This is especially the case when visualising temporal uncertainty. To facilitate the development of innovative and accessible temporal uncertainty visualisation techniques and respond to an identified gap in the literature, we propose the first-ever survey of over 50 temporal uncertainty visualisation techniques deployed in numerous fields. Our paper offers two contributions. First, we propose a novel taxonomy to be applied when classifying temporal uncertainty visualisation techniques. This takes into account the visualisation’s intended audience, as well as its level of discreteness in representing uncertainty. Second, we urge researchers and practitioners to use a greater variety of visualisations which differ in terms of their discreteness. In doing so, we believe that a more robust evaluation of visualisation techniques can be achieved.

2. An Extension of Empirical Orthogonal Functions for the Analysis of Time-Dependent 2D Scalar Field Ensembles [note] [vimeo]

Dominik Vietinghoff	Leipzig University, Leipzig, Germany
Dr. Christian Heine	Leipzig University, Leipzig, Germany
Michael Böttinger	German Climate Computing Center (DKRZ)
Prof. Dr. Gerik Scheuermann	Institute of Computer Science, Leipzig University

Abstract: To assess the reliability of weather forecasts and climate simulations, common practice is to generate large ensembles of numerical simulations. Analyzing such data is challenging and requires pattern and feature detection. For single time-dependent scalar fields, empirical orthogonal functions (EOFs) are a proven means to identify the main variation. In this paper, we present an extension of that concept to time-dependent ensemble data. We applied our methods to two ensemble data sets from climate research in order to investigate the North Atlantic Oscillation (NAO) and East Atlantic (EA) pattern.

3. NetScatter: Visual analytics of multivariate time series with a hybrid of dynamic and static variable relationships [paper] [vimeo]

Bao Dien Quoc Nguyen	IDV lab, Texas Tech University, Lubbock, Texas, United States
Rattikorn Hewett	Department of Computer Science, Texas Tech University, Lubbock, Texas, United States
Tommy Dang	IDV lab, Texas Tech University, Lubbock, Texas, United States

Abstract: The ability to capture common characteristics among complex multivariate time series variables can have a profound impact on big data analytics in uncovering useful insights into the relationships among them and enabling a dimension reduction technique. This paper presents NetScatter, a visual analytic approach to characterizing changes in relationships between each pair of variables in a high-dimensional time series. While time series focus on the dynamics of a single variable, scatter plots focus on static relationships between two variables. Unlike most traditional approaches that employ a single perspective of the visual display, our approach combines static perspectives of two variables in multivariate time series into a single representation by comparing all data instances over two different time steps. The paper also introduces a list of visual features of the representation to capture how overall data evolve. We have implemented a web-based prototype that supports a full range of operations, such as ranking, filtering, and details on demand. The paper illustrates the proposed approach on data of various sizes in different domains to demonstrate its benefits.

4. Stable Visual Summaries for Spatio-Temporal Data [paper] [vimeo]

Jules Wulms	TU Wien, Vienna, Austria
Juri Buchmuller	University of Konstanz, Konstanz, Germany
Wouter Meulemans	TU Eindhoven, Eindhoven, Netherlands
Kevin Verbeek	Eindhoven University of Technology, Eindhoven, Netherlands
Bettina Speckmann	Eindhoven University of Technology, Eindhoven, Netherlands

Abstract: The availability of devices that track moving objects has led to an explosive growth in trajectory data. When exploring the resulting large trajectory collections, visual summaries are a useful tool to identify time intervals of interest. A typical approach is to represent the spatial positions of the tracked objects at each time step via a one-dimensional ordering; visualizations of such orderings can then be placed in temporal order along a time line.
There are two main criteria to assess the quality of the resulting visual summary: spatial quality -- how well does the ordering capture the structure of the data at each time step, and stability -- how coherent are the orderings over consecutive time steps or temporal ranges? In this paper we introduce a new Stable Principal Component (SPC) method to compute such orderings, which is explicitly parameterized for stability, allowing a trade-off between the spatial quality and stability. We conduct extensive computational experiments that quantitatively compare the orderings produced by ours and other stable dimensionality-reduction methods to various state-of-the-art approaches using a set of well-established quality metrics that capture spatial quality and stability. We conclude that stable dimensionality reduction outperforms existing methods on stability, without sacrificing spatial quality or efficiency; in particular, our new SPC method does so at a fraction of the computational costs.

5. Visual Analysis of Spatio-Temporal Trends in Time-Dependent Ensemble Data Sets on the Example of the North Atlantic Oscillation [paper] [vimeo]

Dominik Vietinghoff	Leipzig University, Leipzig, Germany
Dr. Christian Heine	Leipzig University, Leipzig, Germany
Michael Böttinger	German Climate Computing Center (DKRZ), Hamburg, Germany
Dr Nicola Maher	Ocean in the Earth System, Max Planck Institute for Meteorology, Hamburg, Germany
Dr Johann H Jungclaus	Ocean in the Earth System, Max Planck Institute for Meteorology, Hamburg, Germany
Prof. Dr. Gerik Scheuermann	Institute of Computer Science, Leipzig University, Leipzig, Germany

Abstract: A driving factor of the winter weather in Western Europe is the North Atlantic Oscillation (NAO), manifested by fluctuations in the difference of sea level pressure between the Icelandic Low and the Azores High. Different methods have been developed that describe the strength of this oscillation, but they rely on certain assumptions, e.g., fixed positions of these two pressure systems. It is possible that climate change affects the mean location of both the Low and the High and thus the validity of these descriptive methods. This study is the first to visually analyze large ensemble climate change simulations (the MPI Grand Ensemble) to robustly assess shifts of the drivers of the NAO phenomenon using the uncertain northern hemispheric surface pressure fields. For this, we use a sliding window approach and compute empirical orthogonal functions (EOFs) for each window and ensemble member, then compare the uncertainty of local extrema in the results as well as their temporal evolution across different CO2 scenarios. We find systematic northeastward shifts in the location of the pressure systems that correlate with the simulated warming. Applying visualization techniques for this analysis was not straightforward; we reflect and give some lessons learned for the field of visualization.

Session 3: Applications and Infovis

1. Visualization Support for Multi-criteria Decision Making in Software Issue Propagation [note] [vimeo]

Youngtaek Kim	Department of Computer Science and Engineering, Seoul National University
Hyeon Jeon	Department of Computer Science and Engineering, Seoul National University
Young-Ho Kim	College of Information Studies, University of Maryland
Yuhoon Ki	Software Development Team, Samsung Electronics
Hyunjoo Song	School of Computer Science and Engineering, Soongsil University
Prof. Jinwook Seo	Department of Computer Science and Engineering, Seoul National University

Abstract: Finding the propagation scope for various types of issues in Software Product Lines (SPLs) is a complicated Multi-Criteria Decision Making (MCDM) problem. This task often requires human-in-the-loop data analysis, which covers not only multiple product attributes but also contextual information (e.g., internal policy, customer requirements, exceptional cases, cost efficiency). We propose an interactive visualization tool to support MCDM tasks in software issue propagation based on the user's mental model. Our tool enables users to explore multiple criteria with their insight intuitively and find the appropriate propagation scope.

2. Know-What and Know-Who: Document Searching and Exploration using Topic-Based Two-Mode Networks [note] [vimeo]

Jian Zhao	School of Computer Science, University of Waterloo
Maoyuan Sun	Computer Science, Northern Illinois University
Patrick Chiu	FX Palo Alto Laboratory, Palo Alto
Francine Chen	FX Palo Alto Laboratory, Palo Alto
Bee Liew	FX Palo Alto Laboratory, Palo Alto

Abstract: This paper proposes a novel approach for analyzing search results of a document collection, which can help support know-what and know-who information seeking questions. Search results are grouped by topics, and each topic is represented by a two-mode network composed of related documents and authors (i.e., biclusters). We visualize these biclusters in a 2D layout to support interactive visual exploration of the analyzed search results, which highlights a novel way of organizing entities of biclusters. We evaluated our approach using a large academic publication corpus, by testing the distribution of the relevant documents and of lead and prolific authors. The results indicate the effectiveness of our approach compared to traditional 1D ranked lists. Moreover, a user study with 12 participants was conducted to compare our proposed visualization, a simplified variation without topics, and a text-based interface. We report on participants’ task performance, their preference of the three interfaces, and the different strategies used in information seeking.

3. Context-Responsive Labeling in Augmented Reality [paper] [vimeo]

Thomas Köppel	Institute of Visual Computing & Human-Centered Technology, TU Wien, Vienna, Austria
Eduard Gröller	Institute of Visual Computing & Human-Centered Technology, TU Wien, Vienna, Austria
Hsiang-Yun Wu	Institute of Visual Computing & Human-Centered Technology, TU Wien, Vienna, Austria

Abstract: Route planning and navigation are common tasks that often require additional information on points of interest. Augmented Reality (AR) enables mobile users to place text labels in order to provide a composite view associated with additional information in a real-world environment. Nonetheless, displaying all labels for points of interest on a mobile device will lead to unwanted overlaps between information, and thus a context-responsive strategy to properly arrange labels is expected. The technique should remove overlaps, show the right level-of-detail, and maintain label coherence. This is necessary as the viewing angle in an AR system may change rapidly due to users’ behaviors. Coherence plays an essential role in retaining user experience and knowledge, as well as avoiding motion sickness. In this paper, we develop an approach that systematically manages label visibility and levels-of-detail, as well as eliminates unexpected incoherent movement. We introduce three label management strategies, including (1) occlusion management, (2) level-of-detail management, and (3) coherence management by balancing the usage of the mobile phone screen. A greedy approach is developed for fast occlusion handling in AR. A level-of-detail scheme is adopted to arrange various types of labels. A 3D scene manipulation is then built to simultaneously suppress the incoherent behaviors induced by viewing angle changes. Finally, we present the feasibility and applicability of our approach through one synthetic and two real-world scenarios, followed by a qualitative user study.
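The abstract mentions a greedy approach to occlusion handling without giving details. As a rough illustration of the general idea only, the sketch below keeps labels in descending priority order and hides any label whose screen rectangle overlaps one already shown. The `Label` fields, the priority ordering, and the function names are assumptions made for this example and are not taken from the paper.

```python
from typing import List, NamedTuple


class Label(NamedTuple):
    """Axis-aligned screen rectangle for a label (hypothetical structure)."""
    name: str
    x: float
    y: float
    w: float
    h: float
    priority: float


def overlaps(a: Label, b: Label) -> bool:
    """True if the two rectangles share interior area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def greedy_visible(labels: List[Label]) -> List[Label]:
    """Show labels in descending priority, skipping any that would overlap."""
    shown: List[Label] = []
    for label in sorted(labels, key=lambda l: -l.priority):
        if not any(overlaps(label, other) for other in shown):
            shown.append(label)
    return shown


labels = [
    Label("A", 0, 0, 10, 5, priority=3),
    Label("B", 5, 2, 10, 5, priority=2),   # overlaps A, so it is hidden
    Label("C", 20, 0, 5, 5, priority=1),
]
print([label.name for label in greedy_visible(labels)])
```

A greedy pass like this is quadratic in the number of labels in the worst case, which is usually acceptable for the few dozen labels visible on a phone screen at once.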

4. Mapper Interactive: A Scalable, Extendable, and Interactive Toolbox for the Visual Exploration of High-Dimensional Data [paper] [vimeo]

Youjia Zhou	University of Utah, Salt Lake City, Utah, United States
Nithin Chalapathi	University of Utah, Salt Lake City, Utah, United States
Archit Rathore	University of Utah, Salt Lake City, Utah, United States
Yaodong Zhao	University of Utah, Salt Lake City, Utah, United States
Bei Wang	Scientific Computing and Imaging Institute, Salt Lake City, Utah, United States

Abstract: The mapper algorithm is a popular tool from topological data analysis for extracting topological summaries of high-dimensional datasets. In this paper, we present Mapper Interactive, a web-based framework for the interactive analysis and visualization of high-dimensional point cloud data. It implements the mapper algorithm in an interactive, scalable, and easily extendable way, thus supporting practical data analysis. In particular, its command-line API can compute mapper graphs for 1 million points of 256 dimensions in about 3 minutes (4 times faster than the vanilla implementation). Its visual interface allows on-the-fly computation and manipulation of the mapper graph based on user-specified parameters and supports the addition of new analysis modules with a few lines of code. Mapper Interactive makes the mapper algorithm accessible to nonspecialists and accelerates topological analytics workflows.

5. Tac-Miner: Visual Tactic Mining for Multiple Table Tennis Matches [paper] [Honorable Mention]

Jiachen Wang	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China
Jiang Wu	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China
Anqi Cao	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China
Zheng Zhou	Department of Sport Science, College of Education, Hangzhou, Zhejiang, China
Hui Zhang	Department of Sport Science, College of Education, Hangzhou, Zhejiang, China
Yingcai Wu	State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, China

Abstract: In table tennis, tactics specified by three consecutive strokes represent the high-level competition strategies in matches. Effective detection and analysis of tactics can reveal the playing styles of players, as well as their strengths and weaknesses. However, tactical analysis in table tennis is challenging as the analysts can often be overwhelmed by the large quantity and high dimension of the data. Statistical charts have been extensively used by researchers to explore and visualize table tennis data. However, these charts cannot support efficient comparative and correlation analysis of complicated tactic attributes. Besides, existing studies are limited to the analysis of one match. However, one player's strategy can change along with his/her opponents in different matches. Therefore, the data of multiple matches can support a more comprehensive tactical analysis. To address these issues, we introduced a visual analytics system called Tac-Miner to allow analysts to effectively analyze, explore, and compare tactics of multiple matches based on the advanced embedding and dimension reduction algorithms along with an interactive glyph. We evaluate our glyph's usability through a user study and demonstrate the system's usefulness through a case study with insights approved by coaches and domain experts.

Session 4: SciVis and Visual Comparison

1. Mixed-Initiative Approach to Extract Data from Pictures of Medical Invoice [note] [vimeo]

Seokweon Jung	Department of Computer Science and Engineering, Seoul National University
Kiroong Choe	Department of Computer Science and Engineering, Seoul National University
Seokhyeon Park	Department of Computer Science and Engineering, Seoul National University
Hyung-Kwon Ko	Department of Computer Science and Engineering, Seoul National University
Youngtaek Kim	Department of Computer Science and Engineering, Seoul National University
Prof. Jinwook Seo	Department of Computer Science and Engineering, Seoul National University

Abstract: Extracting data from pictures of medical records is a common task in the insurance industry as the patients often send their medical invoices taken by smartphone cameras. However, the overall process is still challenging to be fully automated because of low image quality and variation of templates that exist in the status quo. In this paper, we propose a mixed-initiative pipeline for extracting data from pictures of medical invoices, where deep-learning-based automatic prediction models and task-specific heuristics work together under the mediation of a user. In the user study with 12 participants, we confirmed our mixed-initiative approach can supplement the drawbacks of a fully automated approach within an acceptable completion time. We further discuss the findings, limitations, and future works for designing a mixed-initiative system to extract data from pictures of a complicated table.

2. Asynchronous and Load-Balanced Union-Find for Distributed and Parallel Scientific Data Visualization and Analysis [paper] [vimeo] [Best]

Jiayi Xu	The Ohio State University, Columbus, Ohio, United States
Hanqi Guo	Argonne National Laboratory, Lemont, Illinois, United States
Han-Wei Shen	The Ohio State University, Columbus, Ohio, United States
Mukund Raj	Argonne National Laboratory, Lemont, Illinois, United States
Xueqiao Xu	Lawrence Livermore National Laboratory, Livermore, California, United States
Xueyun Wang	Peking University, Beijing, China
Dr. Zhehui Wang	MS H846, Los Alamos National Laboratory, Los Alamos, New Mexico, United States
Tom Peterka	Argonne National Laboratory, Lemont, Illinois, United States

Abstract: We present a novel distributed union-find algorithm that features asynchronous parallelism and k-d tree based load balancing for scalable visualization and analysis of scientific data. Applications of union-find include level set extraction and critical point tracking, but distributed union-find can suffer from high synchronization costs and imbalanced workloads across parallel processes. In this study, we prove that global synchronizations in existing distributed union-find can be eliminated without changing final results, allowing overlapped communications and computations for scalable processing. We also use a k-d tree decomposition to redistribute inputs, in order to improve workload balancing. We benchmark the scalability of our algorithm with up to 1,024 processes using both synthetic and application data. We demonstrate the use of our algorithm in critical point tracking and super-level set extraction with high-speed imaging experiments and fusion plasma simulations, respectively.
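As background for this abstract, the sketch below shows a plain sequential union-find (with path compression and union by size) used to label connected components of a super-level set on a small 2D grid. It is not the distributed, asynchronous algorithm the paper presents; the class, the function name `super_level_components`, and the 4-neighbor connectivity are assumptions chosen for the illustration.

```python
class UnionFind:
    """Sequential disjoint-set forest with path compression and union by size."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def add(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra          # attach the smaller tree to the larger
        self.size[ra] += self.size[rb]


def super_level_components(grid, threshold):
    """Group cells with value >= threshold into 4-connected components."""
    uf = UnionFind()
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value < threshold:
                continue
            uf.add((r, c))
            # Merge with the upper and left neighbors if they are in the set.
            if r > 0 and grid[r - 1][c] >= threshold:
                uf.union((r, c), (r - 1, c))
            if c > 0 and grid[r][c - 1] >= threshold:
                uf.union((r, c), (r, c - 1))
    components = {}
    for cell in uf.parent:
        components.setdefault(uf.find(cell), []).append(cell)
    return [sorted(cells) for cells in components.values()]


grid = [
    [5, 1, 4],
    [5, 1, 4],
    [1, 1, 4],
]
print(super_level_components(grid, threshold=3))
```

In a distributed setting, the cells would be partitioned across processes and unions that span partitions would require communication, which is where the synchronization and load-balancing costs discussed in the abstract arise.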

3. SurfRiver: Flattening Stream Surfaces for Comparative Visualization [paper] [vimeo]

Jun Zhang	Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana, United States
Jun Tao	School of Data and Computer Science, Sun Yat-sen University, Guangzhou, Guangdong, China
Jian-Xun Wang	University of Notre Dame, Notre Dame, Indiana, United States
Chaoli Wang	Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana, United States

Abstract: We present SurfRiver, a new visual transformation approach that flattens stream surfaces in 3D to rivers in 2D for comparative visualization. Leveraging a TextFlow-like visual metaphor, SurfRiver untangles the convoluted individual stream surfaces along the flow direction and maps them along the horizontal direction of the abstract river view. It stacks multiple surfaces along the vertical direction of the river view. This visual mapping makes it easy for users to track along the flow direction and align stream surfaces for comparative study. Through brushing and linking, the river view is connected to the spatial surface view for collective reasoning. SurfRiver can be used to examine a single stream surface, investigate seeding sensitivity or variability of a family of surfaces from a group of related seeding curves, or explore a collection of representative surfaces. We describe our optimization solution to achieve the desirable mapping, present the SurfRiver interface and interactions, and report results from different flow fields to demonstrate its efficacy. Feedback from a domain expert also indicates the promise of SurfRiver.

4. FiberStars: Visual Comparison of Diffusion Tractography Data between Multiple Subjects [paper] [vimeo]

Loraine Franke	University of Massachusetts Boston, Boston, Massachusetts, United States
Daniel Karl I. Weidele	University of Konstanz, Konstanz, Germany
Fan Zhang	Harvard Medical School, Cambridge, Massachusetts
Suheyla Cetin Karayumak	Harvard Medical School, Cambridge, Massachusetts
Steve Pieper PhD	Harvard Medical School, Cambridge, Massachusetts
Lauren J. O'Donnell	Harvard Medical School, Cambridge, Massachusetts
Yogesh Rathi	Harvard Medical School, Cambridge, Massachusetts
Daniel Haehn	University of Massachusetts Boston, Boston, Massachusetts, United States

Abstract: Tractography from high-dimensional diffusion magnetic resonance imaging (dMRI) data enables analysis of the brain's structural connectivity. Recent dMRI studies aim to compare connectivity patterns across subject groups and disease populations to understand subtle abnormalities in the brain's white matter connectivity and distributions of biologically sensitive dMRI-derived metrics. Existing software products focus solely on the anatomy, are not intuitive, or restrict the comparison of multiple subjects. In this paper, we present the design and implementation of FiberStars, a visual analysis tool for tractography data that allows the interactive visualization of brain fiber clusters combining existing 3D anatomy with compact 2D visualizations. With FiberStars, researchers can analyze and compare multiple subjects in large collections of brain fibers using different views. To evaluate the usability of our software, we performed a quantitative user study. We asked domain experts and non-experts to find patterns in a tractography dataset with either FiberStars or an existing dMRI exploration tool. Our results show that participants using FiberStars can navigate extensive collections of tractography faster and more accurately. All our research, software, and results are available openly.

Session 5: Visual Perception and Design

1. Unravelling the Human Perspective and Considerations for Urban Data Visualization [note] [vimeo]

Sarah Goodwin	Monash University, Melbourne, Australia
Sebastian Meier	HafenCity University Hamburg, CityScienceLab
Dr. Lyn Bartram	School of Interactive Art and Technology, Simon Fraser University
Alex Godwin	Computer Science, American University, Washington
Till Nagel	University of Applied Sciences Mannheim
Marian Dörk	UCLAB, University of Applied Sciences Potsdam

Abstract: Effective use of data is an essential asset to modern cities. Visualization as a tool for analysis, exploration, and communication has become a driving force in the task of unravelling our complex urban fabrics. This paper outlines the findings from a series of three workshops from 2018-2020 bringing together experts in urban data visualization with the aim of exploring multidisciplinary perspectives from the human-centric lens. Based on the rich and detailed workshop discussions identifying challenges and opportunities for urban data visualization research, we outline major human-centric themes and considerations fundamental for CityVis design and introduce a framework for an urban visualization design space.

2. Exploratory User Study on Graph Temporal Encodings [note] [vimeo]

Velitchko Andreev Filipov	TU Wien, Institute of Visual Computing and Human-Centered Technology
Alessio Arleo	Institute of Visual Computing & Human-Centered Technology, TU Wien
Silvia Miksch	TU Wien, Institute of Visual Computing and Human-Centered Technology

Abstract: A temporal graph stores and reflects temporal information associated with its entities and relationships. Such graphs can be utilized to model a broad variety of problems in a multitude of domains. Researchers from different fields of expertise are increasingly applying graph visualization and analysis to explore unknown phenomena, complex emerging structures, and changes occurring over time in their data. While several empirical studies evaluate the benefits and drawbacks of different network representations, visualizing the temporal dimension in graphs still presents an open challenge. In this paper, we propose an exploratory user study with the aim of evaluating different combinations of graph representations, namely node-link and adjacency matrix, and temporal encodings, such as superimposition, juxtaposition and animation, on typical temporal tasks. The study participants expressed positive feedback toward matrix representations, with generally quicker and more accurate responses than with the node-link representation.

3. On the Readability of Abstract Set Visualizations [paper] [vimeo]

Markus Wallinger	Algorithms and Complexity Group, TU Wien, Vienna, Austria
Ben Jacobsen	University of Arizona, Tucson, Arizona, United States
Stephen Kobourov	Computer Science, University of Arizona, Tucson, Arizona, United States
Martin Nöllenburg	Algorithms and Complexity Group, TU Wien, Vienna, Austria

Abstract: Set systems are used to model data that naturally arises in many contexts: social networks have communities, musicians have genres, and patients have symptoms. Visualizations that accurately reflect the information in the underlying set system make it possible to identify the set elements, the sets themselves, and the relationships between the sets. In static contexts, such as print media or infographics, it is necessary to capture this information without the help of interactions. With this in mind, we consider three different systems for medium-sized set data, LineSets, EulerView, and MetroSets, and report the results of a controlled human-subjects experiment comparing their effectiveness. Specifically, we evaluate the performance, in terms of time and error, on tasks that cover the spectrum of static set-based tasks. We also collect and analyze qualitative data about the three different visualization systems. Our results include statistically significant differences, suggesting that MetroSets performs and scales better.

4. Smile or Scowl? Looking at Infographic Design Through the Affective Lens [paper] [vimeo]

Xingyu Lan	Intelligent Big Data Visualization Lab, Tongji University, Shanghai, China
Yang Shi	Intelligent Big Data Visualization Lab, Tongji University, Shanghai, China
Yueyao Zhang	Intelligent Big Data Visualization Lab, Tongji University, Shanghai, China
Prof. Nan Cao	Intelligent Big Data Visualization Lab, Tongji University, Shanghai, China

Abstract: Infographics are frequently promoted for their ability to communicate data to audiences affectively. To facilitate the creation of affect-stirring infographics, it is important to characterize and understand people's affective responses to infographics and derive practical design guidelines for designers. To address these research questions, we first conducted two crowdsourcing studies to identify 12 infographic-associated affective responses and collect user feedback explaining what triggered affective responses in infographics. Then, by coding the user feedback, we present a taxonomy of design heuristics that exemplifies the affect-related design factors in infographics. We evaluated the design heuristics with 15 designers. The results showed that our work supports assessing the affective design in infographics and facilitates the ideation and creation of affective infographics.

5. On the Visualization of Hierarchical Multivariate Data [paper] [vimeo]

Boyan Zheng	Heidelberg University, Heidelberg, Germany
Filip Sadlo	Heidelberg University, Heidelberg, Germany

Abstract: In this paper, we study the visual design of hierarchical multivariate data analysis. We focus on the extension of four hierarchical univariate concepts—the sunburst chart, the icicle plot, the circular treemap, and the bubble treemap—to the multivariate domain. Our study identifies several advantageous design variants, which we discuss with respect to previous approaches, and whose utility we evaluate with a user study and demonstrate for different analysis purposes and different types of data.

Session 6: Graph Drawing

1. Sublinear-Time Attraction Force Computation for Large Complex Graph Drawing [note] [vimeo] [Honorable Mention]

Amyra Meidiana	University of Sydney
Seok-Hee Hong	University of Sydney
Shijun Cai	University of Sydney
Marnijati Torkel	University of Sydney
Peter Eades	University of Sydney

Abstract: Recent works in graph visualization attempt to reduce the runtime of repulsion force computation of force-directed algorithms using sampling; however, they fail to reduce the runtime of attraction force computation to sublinear in the number of edges. We present new sublinear-time algorithms for the attraction force computation of force-directed algorithms and integrate them with sublinear-time repulsion force computation. Extensive experiments show that our algorithms, operated as part of a fully sublinear-time force computation framework, compute graph layouts on average 80% faster than existing linear-time force computation algorithms, with, surprisingly, significantly better quality on edge crossing and shape-based metrics.

2. Louvain-based Multi-level Graph Drawing [note] [vimeo]

Seok-Hee Hong	University of Sydney
Peter Eades	University of Sydney
Marnijati Torkel	University of Sydney
James George Wood	University of Sydney
Kunsoo Park	Seoul National University

Abstract: Multi-level graph drawing is a popular approach to visualizing large and complex graphs. It recursively coarsens a graph and then uncoarsens the drawing using layout refinement. In this paper, we leverage the Louvain community detection algorithm for the multi-level graph drawing paradigm. More specifically, we present the Louvain-based multi-level graph drawing algorithm and compare it with multi-level algorithms based on other community detection algorithms, such as Label Propagation and Infomap clustering. Experiments show that the Louvain-based multi-level algorithm performs best in terms of efficiency (i.e., fastest runtime), while the Label Propagation and Infomap-based multi-level algorithms perform better in terms of effectiveness (i.e., better visualization in quality metrics).

3. GDot: Drawing Graphs with Dots and Circles [paper] [vimeo]

Seok-Hee Hong	School of IT, University of Sydney, Sydney, NSW, Australia
Peter Eades	School of IT, University of Sydney, Sydney, NSW, Australia
Marnijati Torkel	School of IT, University of Sydney, Sydney, NSW, Australia

Abstract: This paper presents a new visual representation of graphs, inspired by the dot painting style of Central Australia. This painting style is established as a powerful medium for communicating information with abstraction, and has a long history of supporting storytelling. We propose a general framework, GDot, to visually represent information as dot paintings. We describe computational techniques as well as the rendering effects to produce painterly representations of graphs and networks. We present visualization examples with various networks from diverse domains, from pure mathematics to social systems. Further, we briefly describe the extension of our dot painting visualization style to multi-dimensional data, dynamic data and geo-referenced data.

4. Sublinear-time Algorithms for Stress Minimization in Graph Drawing [paper] [vimeo]

Amyra Meidiana	University of Sydney, Sydney, Australia
James George Wood	University of Sydney, Sydney, Australia
Seok-Hee Hong	University of Sydney, Sydney, Australia

Abstract: We present algorithms reducing the runtime of the stress minimization iteration of stress-based layouts to sublinear in the number of vertices and edges. Specifically, we use vertex sampling to further reduce the number of vertex pairs considered in stress minimization iterations. Moreover, we use spectral sparsification to reduce the number of edges considered in stress minimization computations to sublinear in the number of edges, especially for dense graphs. Specifically, we present new pivot selection methods using importance-based sampling. Then, we present two variations of sublinear-time stress minimization methods on two popular stress-based layouts, Stress Majorization and Stochastic Gradient Descent. Experimental results demonstrate that our sublinear-time algorithms run, on average, about 35% faster than the state-of-the-art linear-time algorithms, while obtaining similar quality drawings based on stress and shape-based metrics.

Session 7: Visual Analytics

1. Papers101: Supporting the Discovery Process in the Literature Review Workflow for Novice Researchers [note] [vimeo]

Kiroong Choe	Seoul National University
Seokweon Jung	Seoul National University
Seokhyeon Park	Seoul National University
Hwajung Hong	Seoul National University
Prof. Jinwook Seo	Seoul National University

Abstract: A literature review is a critical task in performing research. However, even browsing an academic database and choosing must-read items can be daunting for novice researchers. In this paper, we introduce Papers101, an interactive system that supports novice researchers' discovery of papers relevant to their research topics. Prior to system design, we performed a formative study to investigate what difficulties novice researchers often face and how experienced researchers address them. We found that novice researchers have difficulty in identifying appropriate search terms, choosing which papers to read first, and determining whether they have examined enough candidates. In this work, we identified key requirements for a system dedicated to novices: prioritizing search results, unifying the contexts of multiple search results, and refining and validating the search queries. Accordingly, Papers101 provides an opinionated perspective on selecting important metadata among papers. It also visualizes how the priority among papers is developed along with the users' knowledge discovery process. Finally, we demonstrate the potential usefulness of our system with a case study on the metadata collection of papers in the visualization and HCI communities.

2. Visual Analytics Methods for Interactively Exploring the Campus Lifestyle [note] [vimeo]

Liang Liu	Southwest University of Science and Technology
Song Wang	Southwest University of Science and Technology
Ting Cai	Southwest University of Science and Technology
Hanglin Li	Southwest University of Science and Technology
Weixin Zhao	Southwest University of Science and Technology
Yadong Wu	Sichuan University of Science and Engineering

Abstract: Exploring campus lifestyle is conducive to innovating education management, optimizing campus resource allocation, and providing personalized services, but little attention has been paid to this exploration. A novel interactive system based on behavioral data from campus cards is presented in this paper to provide new ideas and technical support for campus management. Interactive visualization techniques are utilized to help users analyze campus lifestyle via intelligible diagrams. The system contains three functional modules: providing a decision-making reference to educators on students' poverty subsidies, predicting students' academic performance by quantitative analysis, and scheduling cafeteria meals based on a scheduling model during the outbreak of COVID-19. Finally, three exploratory case studies are presented to demonstrate the effectiveness of the system.

3. Investigating the Evolution of Tree Boosting Models with Visual Analytics [paper] [vimeo]

Junpeng Wang	Visa Research, Palo Alto, California, United States
Wei Zhang	Visa Research, Palo Alto, California, United States
Liang Wang	Visa Research, Palo Alto, California, United States
Hao Yang	Visa Research, Palo Alto, California, United States

Abstract: Tree boosting models are widely adopted predictive models and have demonstrated performance superior to other conventional and even deep learning models, especially since the recent release of their parallel and distributed implementations, e.g., XGBoost, LightGBM, and CatBoost. Tree boosting uses a group of sequentially generated weak learners (i.e., decision trees), each of which learns from the mistakes of its predecessor, to push the model's decision boundary towards the true boundary. As the number of trees keeps increasing over training, it is important to reveal how the newly added trees change the predictions of individual data instances, and how the impacts of different data features evolve. To accomplish these goals, in this paper, we introduce a new design of the temporal confusion matrix, providing users with an effective interface to track data instances' predictions across the tree boosting process. Also, we present an improved visualization to better illustrate and compare the impacts of individual data features (based on their SHAP values) across training iterations. Integrating these components with a tree structure visualization component, we propose a visual analytics system for tree boosting models. Through case studies with domain experts using real-world datasets, we validated the system's effectiveness.

4. A Visual Analytics Approach for the Diagnosis of Heterogeneous and Multidimensional Machine Maintenance Data [paper] [vimeo]

Xiaoyu Zhang	University of California, Davis, Davis, California, United States
Takanori Fujiwara	University of California, Davis, Davis, California, United States
Senthil Chandrasegaran	Faculty of Industrial Design Engineering, Delft University of Technology, Delft, Netherlands
Dr. Michael P. Brundage	National Institute of Standards and Technology, Gaithersburg, Maryland, United States
Thurston Sexton	National Institute of Standards and Technology, Gaithersburg, Maryland, United States
Mr. Alden Dima	National Institute of Standards and Technology, Gaithersburg, Maryland, United States
Kwan-Liu Ma	University of California, Davis, Davis, California, United States

Abstract: Analysis of large, high-dimensional, and heterogeneous datasets is challenging as no one technique is suitable for visualizing and clustering such data in order to make sense of the underlying information. For instance, heterogeneous logs detailing machine repair and maintenance in an organization often need to be analyzed to diagnose errors and identify abnormal patterns, formalize root-cause analyses, and plan preventive maintenance. Such real-world datasets are also beset by issues such as inconsistent and/or missing entries. To conduct an effective diagnosis, it is important to extract and understand patterns from the data with support from analytic algorithms (e.g., finding that certain kinds of machine complaints occur more in the summer) while keeping the human in the loop. To address these challenges, we adopt existing techniques for dimensionality reduction (DR) and clustering of numerical, categorical, and text data dimensions, and introduce a visual analytics approach that uses multiple coordinated views to connect DR + clustering results across each of these kinds of data dimension. To help analysts label the clusters, each clustering view is supplemented with techniques and visualizations that contrast a cluster of interest with the rest of the dataset. Our approach helps analysts make sense of machine maintenance logs and their errors. The insights gained then help them carry out preventive maintenance. We illustrate our approach through use cases and evaluate it through expert studies, and discuss generalization of the approach to other heterogeneous data.

5. KeywordMap: Attention-based Visual Exploration for Keyword Analysis [paper]

Yamei Tu	Computer Science and Engineering, The Ohio State University, Columbus, Ohio, United States
Jiayi Xu	Computer Science and Engineering, The Ohio State University, Columbus, Ohio, United States
Han-Wei Shen	Computer Science and Engineering, The Ohio State University, Columbus, Ohio, United States

Abstract: With the high growth rate of text data, extracting meaningful information from a large corpus becomes increasingly difficult. Keyword extraction and analysis is a common approach to tackle the problem, but it is non-trivial to identify important words in the text and represent the multifaceted properties of those words effectively. Traditional topic-modeling-based keyword analysis algorithms require hyper-parameters which are often difficult to tune without enough prior knowledge. In addition, the relationships among the keywords are often difficult to obtain. In this paper, we utilize the attention scores extracted from Transformer-based language models to capture word relationships. We propose a domain-driven attention tuning method, guiding the attention to learn domain-specific word relationships. From the attention, we build a keyword network and propose a novel algorithm, Attention-based Word Influence (AWI), to compute how influential each word is in the network. An interactive visual analytics system, KeywordMap, is developed to support multi-level analysis of keywords and keyword relationships through coordinated views. We measure the quality of keywords captured by our AWI algorithm quantitatively. We also evaluate the usefulness and effectiveness of KeywordMap through case studies.