Majors Project – ChordDiagram

 Ahsan Ali Khoja & Shana Weissman

We started this project with a question: is the choice of major random, or is it affected by a person’s place of origin? The answer is of interest because it would show whether place of origin has any impact on the choice of major. The admissions department could use this when deciding which countries to recruit from, and the marketing department could use it when choosing countries in which to market their programs.

The project started with collecting anonymized data from the registrar’s office. We then wrote Python scripts to clean the data and convert it into a format that R can read as a matrix. We stored the data in a PostgreSQL database and used the Python library “psycopg2” to read it back out and convert it into matrix form.
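The conversion into matrix form can be sketched as follows. This is a minimal illustration, not our actual script: the function name `build_count_matrix` and the sample rows are hypothetical, and the rows are hard-coded so the example runs without a database. In practice they would be the (country, major) tuples returned by a psycopg2 cursor.

```python
from collections import Counter


def build_count_matrix(rows):
    """Turn (country, major) pairs into a countries x majors count matrix."""
    counts = Counter(rows)  # number of students per (country, major) pair
    countries = sorted({country for country, _ in rows})
    majors = sorted({major for _, major in rows})
    # One row per country, one column per major; missing pairs count as 0.
    matrix = [[counts[(c, m)] for m in majors] for c in countries]
    return countries, majors, matrix


# Hypothetical sample rows (assumption: one tuple per student).
rows = [
    ("Pakistan", "Computer Science"),
    ("Pakistan", "Economics"),
    ("USA", "Computer Science"),
    ("USA", "Computer Science"),
    ("Ghana", "Biology"),
]

countries, majors, matrix = build_count_matrix(rows)
for country, row in zip(countries, matrix):
    print(country, row)
```

Sorting the labels keeps the row and column order stable between runs, which makes the resulting diagrams easier to compare.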

We created a matrix of dimensions M x N, where the M rows represented countries and the N columns represented majors. Using Python, we then reduced the data further into a smaller m x n matrix of continents and divisions. We wrote this matrix to a CSV file and passed it to R, where the “circlize” library converted it into a chord diagram.
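The reduction from countries and majors to continents and divisions, and the CSV export, might look roughly like this. It is a sketch under stated assumptions: the lookup tables, the division names, the sample counts, and the helper names `aggregate` and `to_csv` are all hypothetical and are not taken from our data.

```python
import csv
import io


def aggregate(countries, majors, matrix, country_to_continent, major_to_division):
    """Sum a countries x majors matrix into a continents x divisions matrix."""
    continents = sorted({country_to_continent[c] for c in countries})
    divisions = sorted({major_to_division[m] for m in majors})
    totals = {(ct, dv): 0 for ct in continents for dv in divisions}
    for i, c in enumerate(countries):
        for j, m in enumerate(majors):
            totals[(country_to_continent[c], major_to_division[m])] += matrix[i][j]
    small = [[totals[(ct, dv)] for dv in divisions] for ct in continents]
    return continents, divisions, small


def to_csv(row_labels, col_labels, matrix):
    """Render a labelled matrix as CSV text with an empty top-left cell."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([""] + col_labels)
    for label, row in zip(row_labels, matrix):
        writer.writerow([label] + row)
    return buf.getvalue()


# Hypothetical sample data and lookup tables (assumptions for illustration).
countries = ["Ghana", "Pakistan", "USA"]
majors = ["Biology", "Computer Science", "Economics"]
matrix = [[1, 0, 0], [0, 1, 1], [0, 2, 0]]
country_to_continent = {"Ghana": "Africa", "Pakistan": "Asia", "USA": "North America"}
major_to_division = {
    "Biology": "Sciences",
    "Computer Science": "Sciences",
    "Economics": "Social Sciences",
}

continents, divisions, small = aggregate(
    countries, majors, matrix, country_to_continent, major_to_division
)
csv_text = to_csv(continents, divisions, small)
print(csv_text)
```

On the R side, a file in this shape can be read with the first column as row names, converted to a matrix, and passed to circlize’s `chordDiagram` function.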

We are still working on the project, with our focus on making the visualization clearer and more aesthetically pleasing.