VU BreakThru

Introducing Data Science Visions

Posted on Wednesday, January 3, 2018 in News, TIPs 2017.


Associate Professor Andreas Berlind

Written by Associate Professor Andreas Berlind

Zeros and ones have become the new precious commodity of the information age. It’s difficult to exaggerate the impact of data on government, business, society and academia. Organizations collect massive amounts of data and use them to inform all aspects of operations and decision making. For example, every organization with a website collects traffic data about its visitors. What they do with those data varies. The more data-sophisticated organizations deploy teams of experts to mine for patterns and conduct experiments, tailoring web content and tracking the response to achieve some set of objectives. Less sophisticated organizations don’t yet have the expertise in place to do this, but are quickly moving in that direction.

In academia, many fields are being revolutionized by large data sets. Biomedical fields now use electronic medical records, genetic data and medical imaging data. In engineering, we have data on infrastructure systems (dams, power grids, etc.), while the social sciences have census and polling data. In education, we have data from public school systems and user clicks on online courses. In my field of astrophysics, the most groundbreaking discoveries of the past decade, such as the nature of the accelerating universe or how commonplace planets are outside our solar system, have come from mining large sets of data collected from telescopes. Each of these fields has its own data, with unique challenges and different goals. However, the fundamental technical skills needed to analyze the data and extract knowledge are common to all of them: data organization, computer programming, statistics, machine learning, data visualization, etc. The emerging field that combines these elements is called data science, and it is rapidly becoming a focus area in higher education.


The first Data Science Visions think tank focused on data curated by TERA.

This year, the Data Science Visions TIPs program is laying the groundwork for a cohesive and sustained effort in data science at Vanderbilt. Our goals are to build community, spark new research collaborations, establish resources for those who need them (such as support for data collection and curation, and training opportunities), and elevate the conversation at the university about the role of data in society. We also wish to support new educational tracks in data science, which will likely emerge in the coming years, by developing connections to industry and other potential employers for our students and by providing a menu of immersive research experiences.


We recently started a series of data think tanks, each of which features a specific type of data. The objective is to put people with interesting data in the same room as people with data science expertise so that new collaborations may ensue.


We have hosted two data think tanks so far. The first one featured data curated by the Tennessee Education Research Alliance (TERA), a partnership between Vanderbilt’s Peabody School of Education and the State of Tennessee Department of Education. TERA houses an amazing dataset covering the public schools, teachers and students in Tennessee, dating back to 2005. The data include student demographics and socioeconomic status, enrollment, attendance, discipline, performance on standardized tests and grades for all elementary and secondary school students in the state, as well as students in public pre-K. Data on school staff include information related to teacher education, licensure, employment history, work assignments, compensation and performance evaluations. Our TERA think tank was attended by TERA leaders and graduate students from Peabody, as well as postdocs and faculty from Astrophysics, Bioinformatics, Biostatistics and Computer Science. The TERA experts described their research questions (e.g., how can we improve early childhood reading skills?) and gave a tour of the data: what’s available, how it’s stored, what’s missing, etc. At the end of the session, we began brainstorming possible projects that could apply data science methods (e.g., machine learning) to the TERA data. We are now identifying one or two concrete projects for continued collaboration.

The second Data Science Visions think tank focused on data curated by Associate Professor Steven Wernke and his graduate students.

Our second think tank featured satellite imaging data of Inca settlements in Peru. Associate Professor of Anthropology Steven Wernke and his graduate students showed us images of pre-Hispanic fortifications, settlements and surrounding agricultural terracing that they have carefully identified and characterized on-site over the past several years. The session was attended by graduate students, postdocs and faculty from Anthropology, Astrophysics, Bioinformatics, Chemistry, Computer Science and Statistics, and we brainstormed how to use machine learning, computer vision and even game theory to automate the process of discovering and characterizing new sites. The think tank has already sparked a new collaboration focused on discovering previously unidentified hilltop fortifications dating from 1100 to 1450.

Be sure to return to the blog regularly for updates on these and other activities related to our TIPs program. For more information, follow us on Twitter @VUDataScience.
