IEEE Trans Vis Comput Graph. 2018 Jan;24(1):205-214. doi: 10.1109/TVCG.2017.2744080. Epub 2017 Aug 29.
Does the structure of family trees differ by ancestral traits over generations, and if so, how? This is a fundamental question about the structural heterogeneity of family trees in multi-generational transmission research. However, previous work has mostly focused on parent-child scenarios, because proper tools for handling the complexity of multi-generational processes have been lacking. Through an iterative design study with social scientists and historians, we developed TreeEvo, which helps users generate and test empirical hypotheses for multi-generational research. TreeEvo dynamically summarizes and organizes family trees by their structural features, building on a traditional Sankey diagram. We further propose a pixel-based technique that compactly encodes trees with complex structures within each Sankey node. Detailed information about the trees is accessible through a space-efficient visualization with semantic zooming. Moreover, TreeEvo embeds a Multinomial Logit Model (MLM) to examine statistical associations between tree structure and ancestral traits. We demonstrate the effectiveness and usefulness of TreeEvo through an in-depth case study with domain experts using a real-world dataset containing 54,128 family trees of 126,196 individuals.
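As a rough illustration of the multinomial logit model mentioned in the abstract, the sketch below computes category probabilities from a linear score per category. The abstract does not give the model specification, so the function name `multinomial_logit_probs`, the tree-structure categories, the single ancestral-trait feature, and all coefficient values are hypothetical and chosen only to show the mechanics.

```python
import math


def multinomial_logit_probs(features, coefs):
    """Return P(category | features) under a multinomial logit model.

    coefs maps each category label to [intercept, slope_1, slope_2, ...].
    The reference category is the one whose coefficients are all zero.
    """
    # Linear score for each category: intercept + sum of slope * feature.
    scores = {
        label: beta[0] + sum(b * x for b, x in zip(beta[1:], features))
        for label, beta in coefs.items()
    }
    # Subtract the largest score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Hypothetical coefficients for illustration only (not taken from the paper).
coefs = {
    "small_tree": [0.0, 0.0],    # reference category
    "medium_tree": [-0.5, 0.8],
    "large_tree": [-1.0, 1.5],
}

# One hypothetical binary ancestral trait, set to 1.0 (trait present).
probs = multinomial_logit_probs([1.0], coefs)
for label, p in probs.items():
    print(f"{label}: {p:.3f}")
```

In this toy setup, a positive slope for a category means the trait raises the odds of that tree structure relative to the reference category. A fitted model would estimate the coefficients from data rather than fixing them by hand.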