Overview
FanFiction Search Engine: A full-text search engine for Code Lyoko FanFiction. Final project for Search Informatics.
- Source Code: https://github.com/hayesall/CLFanfictionSearchEngine
Motivation
The final portion of the “Search Informatics” course I took involved developing an entire search engine: a frontend website, a backend for returning pages, spiders for crawling the web, and scrapers for turning pages into searchable text.
At the time, FanFiction.Net lacked a full-text search option. It was possible to search for content in the “title” or “synopsis” sections, but there was no way to know if any of the stories mentioned “twitter.”
![Screenshot of the search engine page, containing a text box where a user can write their query, a button to start the search, and some disclaimers about the project: specifically that this was done with regards to the terms and conditions of the host website.](/images/software/cl-fanfiction-search-engine/ff_search_screenshot.png)
The PageRank algorithm operates on a directed graph, so it was possible to incorporate some knowledge about how the website worked. There are “Users” and “Stories,” which I generally drew with violet or blue nodes. Users can write or review stories—therefore I could crawl over all the stories, extract text to build the search engine, and record user interactions to build the network.
![Simple sixteen node network of users and stories, where directed edges show that a user wrote or reviewed a story.](https://raw.githubusercontent.com/hayesall/CLFanfictionSearchEngine/master/media/directed-fanfiction-graph.jpg)
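As a rough illustration of the idea, here is a minimal PageRank sketch over a user-to-story graph. It is not the code from the project: the node names, the edge list, the damping factor of 0.85, and the `pagerank` helper are all assumptions made for the example. Edges point from a user to a story they wrote or reviewed, so stories have no outgoing edges and their rank is redistributed evenly on each iteration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes without outgoing edges (here, the stories) would otherwise
        # leak rank, so their total is shared evenly across all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in out_links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical data: each edge means "user wrote or reviewed story".
edges = [
    ("user:alice", "story:1"),
    ("user:bob", "story:1"),
    ("user:bob", "story:2"),
    ("user:carol", "story:1"),
    ("user:carol", "story:2"),
]

ranks = pagerank(edges)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

In this toy graph the story with more incoming edges ends up with the higher score, which is the kind of signal a search backend could use to order results.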
When visualized, the network had an extremely dense inner region, reflecting a high amount of engagement between users and stories. Some users and stories received less attention.
![Complete network of the community, showing a dense inner region where many users and stories interact with one another.](https://github.com/hayesall/CLFanfictionSearchEngine/blob/master/media/fan-network9.png?raw=true)
Conclusions
I wrote some notes about this as a blog post: Network of Code Lyoko FanFiction.
This search engine is no longer maintained. I still study networks and learning on arbitrarily structured data, though.