Data Analysis and Visualization

TYPE: d3.js, Python, R, NetworkX, HTML/CSS
INVOLVEMENT: UMSI course projects
DURATION: Sep 2013 - May 2014
COLLABORATORS: See details per project
MY ROLE: Data collection, analysis, visualization


Jobble Map is a group project for course SI649-Information Visualization. In this project, we designed an interactive visualization called Jobble Map that presents occupational information for every state in the United States in 2012, based on data from the Bureau of Labor Statistics, U.S. Department of Labor. The system lets users compare occupational information between states by job opportunity, job popularity, salary, and unemployment rate.

Jobble Map Overview


We used a heatmap to visualize job data for the states, with four colors representing the four dimensions: green for Job Opportunity, orange for Job Popularity, blue for Salary, and red for Unemployment. A legend in the lower-right corner of the map indicates the data range of each level.
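
As a rough illustration of how values can be mapped to color levels and legend ranges, here is a minimal Python sketch. It assumes equal-width bins, which may not be the binning Jobble Map actually used. The function names and the salary figures are hypothetical and are not taken from the project or from the Bureau of Labor Statistics data.

```python
def equal_width_levels(values, n_levels=5):
    """Assign each value to one of n_levels equal-width bins (0 = lowest)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / n_levels or 1.0
    levels = []
    for v in values:
        idx = int((v - lo) / width)
        # The maximum value would land in bin n_levels, so clamp it.
        levels.append(min(idx, n_levels - 1))
    return levels


def legend_ranges(values, n_levels=5):
    """Return the (lower, upper) data range shown in the legend for each level."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_levels
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n_levels)]


# Hypothetical salary values for five states (illustration only).
salaries = [30000, 40000, 50000, 60000, 70000]
levels = equal_width_levels(salaries, n_levels=4)
ranges = legend_ranges(salaries, n_levels=4)
```

Each level index would then be mapped to a shade of the color for the selected dimension, for example lighter to darker blue for Salary.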

Four Color Dimensions
Color dimensions


We designed a control panel with auto-complete search and drop-down menus for looking up a specific occupation. Users can further narrow the results using the salary slider and the state and job popularity drop-down menus. They can also add states to a favorites list for future reference.

Control Panel
Favorite List


We also used a summary chart to visualize the overall job data for each state, where the x-axis shows the states and the y-axis shows the value of the corresponding dimension.

Summary chart


Besides the summary barchart, Jobble Map also offers four barcharts, one for each job dimension, so that users can compare up to five states and the national average.

Comparison chart


This is an independent project for course SI649-Information Visualization. In this project, students were asked to visualize two sets of data and reveal their correlations through interactions. I used the sleep data collected with Sleep as Android and the game data I collected using Leap Motion in the game Fruit Ninja to look into possible relationships between my sleep and my game performance. After exploring several design variations and combining barcharts and linecharts, I arrived at some key findings from the two sets of data.

  • The time I go to bed and the time I get up are not necessarily positively correlated.
  • Shorter sleeps may have a larger proportion of deep sleep and higher sleep quality.
  • In general, my game performance is positively correlated with my sleep duration.
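
One way to check a relationship like the last finding numerically is the Pearson correlation coefficient. The sketch below is only an illustration: the project itself drew its conclusions from the visualizations, and the sleep durations and game scores here are hypothetical placeholders, not my recorded data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and each series' deviation norm.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (norm_x * norm_y)


# Hypothetical nights: hours slept and the next day's game score.
sleep_hours = [5.5, 6.0, 6.5, 7.0, 8.0]
game_scores = [110, 120, 135, 150, 160]
r = pearson(sleep_hours, game_scores)
```

A value of r close to 1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear relationship.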

My Sleep Visualization Overview
Design variations

Do likeminded people go to similar places?

This is an independent project for course SI601-Data Manipulations. In this project, I tried to find out whether likeminded people go to similar places, which may help student clubs market their events. I collected user data from Foursquare check-ins at the three most popular food places in Ann Arbor and looked into those users' Twitter activities.

My Sleep Visualization Overview

After finishing the first version of the project, I analyzed the three food places in Ann Arbor with the most check-ins. After getting the Twitter accounts of 10 users who left tips on Foursquare for these restaurants, I counted the words in their Twitter descriptions and in those of 10 of their friends. Below are some of my findings:

My Sleep Visualization Overview
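
The word-counting step described above can be sketched in Python as follows. This is a minimal illustration, not the original script: the tokenization rule, the stopword list, and the sample descriptions are all assumptions made for the example, and no Twitter or Foursquare API calls are shown.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "the", "and", "of", "in", "to", "i", "my", "at", "for", "is"}


def count_words(descriptions, stopwords=STOPWORDS):
    """Count lowercase words across profile descriptions, skipping stopwords."""
    counts = Counter()
    for text in descriptions:
        # Keep runs of letters and apostrophes; drop punctuation and digits.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts


# Made-up descriptions standing in for the collected Twitter descriptions.
sample = [
    "Student at the University of Michigan. Coffee and music lover.",
    "Foodie, coffee addict, Michigan football fan.",
    "Music, travel, and food.",
]
counts = count_words(sample)
top_words = counts.most_common(3)
```

Comparing the most common words between users who check in at the same place gives a rough signal of shared interests.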


This is an independent project for course SI601-Data Manipulations. As a second-year SI student, I am collecting data and information to help with my job search. I am especially curious about the relationships among occupation, salary, and state.


From the 22 barcharts generated for the 22 major occupations, it is clear that states in the Northeast and West generally offer higher annual salaries than states in the South and Midwest. The District of Columbia offers the highest annual salary in many popular major occupations, and it is in the top 3 for most of the occupations.

Barchart for each occupation


From the faceted graph, I can tell that the five best-paid major occupations are stable across all states: Management Occupations, Legal Occupations, Computer and Mathematical Occupations, Architecture and Engineering Occupations, and Business and Financial Operations Occupations. However, their rankings differ slightly from one state to another.

Facet barcharts for each state


It is very interesting to look at the big differences between detailed occupations within the same major occupation group, both in the number of detailed occupations and in annual salaries. Within the major category "Computer and Mathematical Occupations", the most highly paid sub-occupation is "Mathematicians", followed by "Computer and Information Research Scientists". More interestingly, when I looked at the boxplot for "Legal Occupations", I was surprised to find that most of the sub-occupations are not very well paid, contrary to my assumption. However, "Judges, Magistrate Judges, and Magistrates" are very well paid, which raises the overall annual median salary of the whole occupation group.

Boxplot for each occupation


In the graph generated by complete-linkage analysis using Euclidean distance, we can see that over 800 detailed occupations are clustered into 7 big groups. At the bottom of the graph, shown in blue, are the most highly paid occupations across the states. Most of them are management-related occupations, including "Management Occupations", "Human Resources Managers", and so on, but "Lawyers", "Software Developers", "Dentists", and other specialist occupations fall into the same group, which means they are all very well paid. The less well-paid occupations are grouped in red and include art- and labor-related occupations such as "Dancers", "Farm Labor Contractors", "Actors", and "Musicians and Singers".
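
To make the clustering step concrete, here is a small pure-Python sketch of agglomerative clustering with complete linkage and Euclidean distance. It is an illustration under stated assumptions, not the code used for the graph: each occupation is represented by a hypothetical vector of salaries in two states (in thousands of dollars), the numbers are made up rather than taken from the data, and a library routine would normally replace this naive loop.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(points, n_clusters):
    """Merge clusters until n_clusters remain; returns lists of point indices."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the farthest pair.
                d = max(euclidean(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair of clusters (j > i, so deleting j is safe).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical salary vectors (two states each), for illustration only.
names = ["Lawyers", "Software Developers", "Dentists", "Dancers", "Actors"]
vectors = [[130, 120], [100, 95], [150, 140], [30, 28], [35, 33]]
groups = complete_linkage(vectors, n_clusters=2)
grouped_names = [sorted(names[i] for i in g) for g in groups]
```

With these made-up numbers, the three specialist occupations end up in one group and the two performing-arts occupations in the other, mirroring the high-paid and low-paid groups described above.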

Facet barcharts for each state