Topological Data Analysis

Aarav Shah '28

With advances in programming languages and a growing impact on the broader fields of science and engineering, topological data analysis (TDA) has sparked interest across the STEM community (Smith, Dłotko, & Zavala, 2021). TDA builds on developments in topology, a branch of mathematics that studies the properties of shapes that remain unchanged under continuous deformations such as stretching and bending (Talebi, 2022). Leonhard Euler, a renowned mathematician, helped initiate the scholarship in topology, most notably with his work on the Seven Bridges of Königsberg problem. Euler reduced the city to a graph of dots and lines, with land masses as dots and bridges as lines, simplifying the information collected from the physical location (Talebi, 2022). Today, topological data analysis pursues the same mission of collecting data and simplifying it into more accessible presentations, with applications ranging from medical research to S&P 500 statistics.

To begin, topological data analysis is a form of data science that extracts and studies the shape of gathered data. TDA consists of four main steps: data collection, shape generation, characterization of the shape’s features, and statistical analysis (Talebi, 2022). These four steps are interconnected, with each depending on the completion of the previous one to ensure reliable results. For example, without proper shape generation, characterizing the shape’s features could yield incorrect information.
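The middle two steps can be illustrated with a minimal sketch. It is not taken from the cited sources: the function names, the distance threshold `eps`, and the choice of a simple neighborhood graph as the "shape" are all assumptions made for illustration. The shape is generated by joining nearby points, and its features are characterized by counting connected pieces and independent loops (for a graph, loops = edges - vertices + pieces).

```python
import math
from itertools import combinations

def neighborhood_graph(points, eps):
    """Shape generation: join every pair of points at most eps apart."""
    return [(i, j) for i, j in combinations(range(len(points)), 2)
            if math.dist(points[i], points[j]) <= eps]

def count_components(n, edges):
    """Count connected pieces of the graph with a small union-find."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

def graph_features(points, eps):
    """Characterization: number of pieces and of independent loops."""
    edges = neighborhood_graph(points, eps)
    pieces = count_components(len(points), edges)
    loops = len(edges) - len(points) + pieces
    return pieces, loops

# Hypothetical data: 12 evenly spaced points on a unit circle.
ring = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
        for k in range(12)]
print(graph_features(ring, eps=0.6))  # (1, 1): one piece, one loop
```

The result depends on the chosen threshold: with `eps=0.3` no points are joined, so the same data reads as twelve separate pieces and no loop. This is one way to see why poor shape generation leads to misleading features.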

The Mapper algorithm, one TDA technique, is useful for “exploratory analysis” of high-dimensional data sets (Talebi, 2022). For example, it was used to analyze S&P 500 financial data. To achieve this goal, the programmers followed a five-step process: data collection, data rescaling and simplification, defining a cover, clustering the data, and constructing a representative graph (Talebi, 2022). In the clustering step, the points that fall into each element of the cover are mapped back to the original data set and grouped there. After applying TDA to the S&P 500 data, the programmers gained a clearer picture of the data sets through the color-coding and clusters of the resulting graph (Talebi, 2022).
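The steps after data collection can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the implementation used in the cited analysis: the `lens` function (the simplification step), the overlapping interval cover, the distance-threshold clustering, and every name and parameter value below are hypothetical choices.

```python
import math
from itertools import combinations

def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with n equal intervals, each widened so neighbors overlap."""
    length = (hi - lo) / n_intervals
    pad = length * overlap / 2
    return [(lo + i * length - pad, lo + (i + 1) * length + pad)
            for i in range(n_intervals)]

def cluster(indices, points, eps):
    """Group points so that any two within eps of each other share a cluster."""
    clusters, unvisited = [], set(indices)
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters

def mapper(points, lens, n_intervals=4, overlap=0.5, eps=0.5):
    """Return (nodes, edges): nodes are clusters, edges join clusters sharing points."""
    values = [lens(p) for p in points]          # simplify each point to one number
    nodes = []
    for lo, hi in interval_cover(min(values), max(values), n_intervals, overlap):
        members = [i for i, v in enumerate(values) if lo <= v <= hi]
        nodes.extend(cluster(members, points, eps))  # cluster in the original space
    edges = {(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]}
    return nodes, edges

# Hypothetical data: 40 evenly spaced points on a unit circle, with the
# x-coordinate as the lens.
circle = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
          for k in range(40)]
nodes, edges = mapper(circle, lens=lambda p: p[0], eps=0.3)
print(len(nodes), len(edges))
```

On this evenly sampled circle the graph comes out as six nodes joined in a single loop, so the summary graph keeps the shape of the data while discarding the individual points.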

When analyzing data sets, the biggest problem with traditional approaches is the unintended presence of noise that obscures the data’s main message (Talebi, 2022). TDA, however, is quite effective at limiting the impact of noise on the interpretation and analysis of the data (Talebi, 2022). Additionally, TDA excels at handling high-dimensional data sets, as it reduces them to a simpler, lower-dimensional representation (Talebi, 2022).
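One way to see why small-scale noise has limited effect is to track how long each feature survives as the scale of analysis grows, which is the idea behind persistent homology (the topic of one of the Talebi articles in the references). The sketch below is an illustrative assumption rather than anything taken from the sources: it covers only the simplest case, connected groups of points, and the function name and sample data are hypothetical. Every point starts as its own group, and the code records the distance at which groups merge.

```python
import math
from itertools import combinations

def merge_scales(points):
    """Return, in increasing order, the distances at which separate groups merge."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Consider all pairs of points from closest to farthest.
    pairs = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    merges = []
    for distance, i, j in pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:           # two groups meet for the first time
            parent[root_i] = root_j
            merges.append(distance)
    return merges

# Hypothetical data: two tight groups of three points, far apart.
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
merges = merge_scales(points)
print(merges)
```

Here four merges happen at a distance of about 0.1 and the last one at about 7. The short-lived groups reflect small jitter within each group, while the large gap before the final merge signals that the two groups are a real feature of the data.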

From a medical perspective, TDA has been used for a variety of purposes. For one, TDA has been used to examine brain dendrograms (tree diagrams) and brain networks of children with attention-deficit/hyperactivity disorder (ADHD) and autism (Smith et al., 2021). From an engineering perspective, TDA has been a useful tool for studying the structure of glasses, with the ability to detect complex features such as voids and rings in silica and metallic glasses (Smith et al., 2021).

Despite these applications, TDA has drawbacks and areas for improvement, including refining the accuracy of its statistical and computational methods (Chung & Ombao, 2021). Additionally, making the mathematical proofs behind TDA’s calculations and processes accessible to a younger audience will help ensure the continued use and exploration of TDA.


References

Chung, M. K., & Ombao, H. (2021). Discussion of 'Event history and topological data analysis'. Biometrika, 108(4), 775–778. https://doi.org/10.1093/biomet/asab023
Smith, A. D., Dłotko, P., & Zavala, V. M. (2021, March). Topological data analysis: Concepts, computation, and applications in chemical engineering. Computers & Chemical Engineering, 146. Retrieved from https://www.sciencedirect.com/science/article/abs/pii/S009813542031245X
Talebi, Shaw. (2022, May 21). Topological Data Analysis (TDA). Medium. Retrieved from https://medium.com/data-science/topological-data-analysis-tda-b7f9b770c951
Talebi, Shaw. (2022, June 3). The Mapper Algorithm. Medium. Retrieved from https://medium.datadriveninvestor.com/the-mapper-algorithm-d0842f926658
Talebi, Shaw. (2022, June 16). Persistent Homology. Medium. Retrieved from https://medium.com/datadriveninvestor/persistent-homology-f22789d753c4

©2021 by Lawrenceville Science Reports.
