Graphs from Features: Tree-based Graph Layout for Feature Analysis

Rosane Minghim, Liz Huancapaza, Erasmo Artur, Guilherme P. Telles and Ivar V. Belizario

Source Code Download (Updated on 06/06/2021) | PDF (Open Access)


You can cite our paper using the BibTeX entry below:

@Article{minghim2020gff,
  AUTHOR = {Minghim, Rosane and Huancapaza, Liz and Artur, Erasmo and Telles, Guilherme P. and Belizario, Ivar V.},
  TITLE = {Graphs from Features: Tree-Based Graph Layout for Feature Analysis},
  JOURNAL = {Algorithms},
  VOLUME = {13},
  YEAR = {2020},
  NUMBER = {11},
  ARTICLE-NUMBER = {302},
  URL = {https://www.mdpi.com/1999-4893/13/11/302},
  ISSN = {1999-4893},
  ABSTRACT = {Feature Analysis has become a very critical task in data analysis and visualization. Graph structures are very flexible in terms of representation and may encode important information on features but are challenging in regards to layout being adequate for analysis tasks. In this study, we propose and develop similarity-based graph layouts with the purpose of locating relevant patterns in sets of features, thus supporting feature analysis and selection. We apply a tree layout in the first step of the strategy, to accomplish node placement and overview based on feature similarity. By drawing the remainder of the graph edges on demand, further grouping and relationships among features are revealed. We evaluate those groups and relationships in terms of their effectiveness in exploring feature sets for data analysis. Correlation of features with a target categorical attribute and feature ranking are added to support the task. Multidimensional projections are employed to plot the dataset based on selected attributes to reveal the effectiveness of the feature set. Our results have shown that the tree-graph layout framework allows for a number of observations that are very important in user-centric feature selection, and not easy to observe by any other available tool. They provide a way of finding relevant and irrelevant features, spurious sets of noisy features, groups of similar features, and opposite features, all of which are essential tasks in different scenarios of data analysis. Case studies in application areas centered on documents, images and sound data demonstrate the ability of the framework to quickly reach a satisfactory compact representation from a larger feature set.},
  DOI = {10.3390/a13110302}
}