This guest post is a part of a short series about Guy Yachdav, Tatyana Goldberg and Christian Dallago and the journey that was inspired by their participation as Google Summer of Code mentors for the BioJS project. Don't miss the first post in the series. Heads up, this post contains spoilers for Game of Thrones seasons 5 and 6!
We built on the Google Summer of Code at the Technical University of Munich, where we began with two dozen students who worked on expanding the BioJS visualization library. Our class became popular quickly and the number of applicants doubled each semester (nearly 180 applicants for 40 seats in the 2016 summer term).
In 2016 our team grew to include Christian Dallago.
Our aim was to create an online portal for Game of Thrones fans that would:
- Provide the most comprehensive, structured and open data set about the Game of Thrones world accessible via API.
- Listen to what people are saying on Twitter about each of the show's characters.
- Use machine learning algorithms to predict the likelihood of each character's death.
Our plan worked: the students were engaged. It was a beautiful sight to see: GitHub repos humming with activity as each dev team delved deeper into their projects. As a project manager, you know you've got something good when issues are being opened and closed at 4:00 AM!
The results were mind-blowing. In 50 days of programming, 36 students opened over 1,200 issues and pull requests, pushed 3,300 commits, released four apps to NPM, and, of course, produced one absolutely amazing website.
The website gathers data on 2,028 characters. Our map shows 240 landmarks and the paths traveled by 28 characters. Our Twitter sentiment analysis tool analyzed over 3 million tweets. And we launched the first-ever machine learning-based algorithm that predicts the likelihood of dying for the 1,451 characters in the show that are still alive.
Visualization of Twitter sentiment analysis data for Jon Snow during season 5 of Game of Thrones.
The X axis shows the timeline and the Y axis shows the number of positive (green) and negative (red) tweets. Each tweet is analyzed by an algorithm using a neural network to determine whether the tweet's writer has a positive, negative or neutral attitude toward the character.
Since launch, the site's popularity has skyrocketed. Following our press release, we were covered by over 1,500 media outlets, most notably Time, The Guardian, Rolling Stone, Daily Mail, The Telegraph and many more. HowStuffWorks, The Vulture and others produced videos about the site and Chris Hardwick's Comedy Central show did a segment about us. We've also given countless interviews to TV, radio and newspapers.
Google Analytics for the website.
The left chart shows the number of visitors to the website during the first week after launch, reaching over 73K visitors on April 25th. The right chart shows the number of visitors at a given time point during the same week.
The most exciting part of the project was using machine learning to predict the likelihood that any given character would die. Machine learning algorithms find rules and patterns in the data that humans cannot easily detect. Once the rules and patterns are identified, we apply them to make inferences or predictions from novel, previously unseen data sets.
Warning: The next paragraphs contain spoilers for seasons 5 and 6 of Game of Thrones!
In order to predict the likelihood of a character's death, we collected information about all of the characters that appeared in books 1 to 5 and analyzed over 30 features, including age, gender, marital status and others. Then we used a support vector machine (SVM) to statistically compare the features of characters, both dead and alive, to predict who would get the axe next. Our prediction was correct for 74% of all cases and surprised us by placing a number of characters thought to be relatively safe in grave danger.
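For readers curious what an SVM looks like in code, here is a minimal sketch in plain Python. It is not the project's actual model: it trains a tiny linear SVM by stochastic subgradient descent on the hinge loss, the feature names and values are invented toy data, and the bias term is left out on the assumption that the features are already centered.

```python
"""Toy linear SVM trained by subgradient descent on the hinge loss.

All feature names and values below are invented for illustration;
they are not taken from the project's data set.
"""

# Each row: [age, dead_relatives, married], assumed centered and scaled.
FEATURES = [
    [1.2, 0.8, 0.5],
    [0.9, 1.1, -0.4],
    [0.4, 1.3, 0.6],
    [-1.0, -0.7, 0.3],
    [-0.8, -1.2, -0.5],
    [-0.6, -0.9, 0.2],
]
# +1 = character is dead, -1 = character is alive.
LABELS = [1, 1, 1, -1, -1, -1]


def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(rows, labels, lr=0.1, lam=0.01, epochs=200):
    """Take stochastic subgradient steps on lam/2 * |w|^2 + hinge loss.

    There is no bias term because the toy features are centered.
    """
    w = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            violates_margin = y * dot(w, x) < 1.0
            # The regularization term shrinks w on every step.
            w = [(1.0 - lr * lam) * wj for wj in w]
            if violates_margin:
                # The hinge-loss subgradient pushes w toward y * x.
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 (predicted dead) or -1 (predicted alive)."""
    return 1 if dot(w, x) > 0 else -1


weights = train_linear_svm(FEATURES, LABELS)
correct = sum(predict(weights, x) == y for x, y in zip(FEATURES, LABELS))
print(f"training accuracy: {correct}/{len(LABELS)}")
print("unseen character:", predict(weights, [1.0, 1.0, 0.0]))
```

Note that a sketch like this only outputs a class label. Reporting a percentage chance of death, as in this post, would require an extra calibration step on top of the SVM's decision values, which is omitted here.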
According to our predictions, Jon Snow, who was seemingly betrayed and murdered by fellow members of the Night's Watch at the end of season 5, had only an 11% chance of dying. Indeed, Jon rose from the dead in the second episode of season 6! We also predicted that the rulers of Dorne, Doran and Trystane Martell, had a high likelihood of death and, as predicted, they were taken out in the first episode of the new season.
Of course, as is always the case with predictions, there were also misses. We didn't expect Roose Bolton to be killed off, nor did we see Hodor's departure coming.
This experience was an amazing ride for our team and it all started with Google Summer of Code! In the next post we'll share what followed and where we see ourselves heading in the future.
By Guy Yachdav, Tatyana Goldberg and Christian Dallago, BioJS