Overview
FanFiction Search Engine: A full-text search engine for Code Lyoko FanFiction. Final project for Search Informatics.
- Source Code: https://github.com/hayesall/CLFanfictionSearchEngine
Motivation
The final portion of the “Search Informatics” course I took involved developing an entire search engine: a frontend website, a backend for returning pages, the spiders for crawling the web, and scrapers for turning pages into searchable text.
At the time, FanFiction.Net lacked a full-text search option. It was possible to search for content in the “title” or “synopsis” sections, but there was no way to know if any of the stories mentioned “twitter.”
[Image: Screenshot of the search engine page, containing a text box where a user can write their query, a button to start the search, and some disclaimers about the project: specifically that this was done with regards to the terms and conditions of the host website.]
The PageRank algorithm operates on a directed graph, so it was possible to incorporate some knowledge about how the website worked. There are “Users” and “Stories,” which I generally drew with violet or blue nodes. Users can write or review stories—therefore I could crawl over all the stories, extract text to build the search engine, and record user interactions to build the network.
[Image: Simple sixteen node network of users and stories, where directed edges show that a user wrote or reviewed a story.]
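To make the idea concrete, here is a minimal sketch of PageRank by power iteration over a small user-and-story graph. It is not the project's actual code: the `pagerank` function, the node names, the edge list, and the damping factor of 0.85 are all assumptions made for illustration, and the sketch says nothing about how the rank scores were combined with text matching in the real search engine.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes with no outgoing edges (here, the stories) spread
        # their rank evenly over every node in the graph.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        # Every other node splits its rank evenly among its targets.
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


# Hypothetical data: an edge (user, story) means the user wrote or
# reviewed the story. These names are made up for the example.
edges = [
    ("user:alice", "story:1"),  # alice wrote story 1
    ("user:bob", "story:1"),    # bob reviewed story 1
    ("user:bob", "story:2"),    # bob wrote story 2
    ("user:carol", "story:1"),  # carol reviewed story 1
]

ranks = pagerank(edges)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

In this toy graph the story with more incoming edges ends up with the higher score, which is the intuition behind using review activity as a signal of attention.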
When visualized, the network had an extremely dense inner region, reflecting a high amount of engagement between users and stories. Some users and stories received less attention.
[Image: Complete network of the community, showing a dense inner region where many users and stories interact with one another.]
Conclusions
I wrote some notes about this as a blog post: Network of Code Lyoko FanFiction.
This search engine is no longer maintained, though I still study networks and learning on arbitrarily structured data.