Alexander L. Hayes • Curriculum Vitae • hayesall.com

Curriculum Vitae

Alexander L. Hayes
hayesall@iu.edu
Indiana University Bloomington
Luddy School of Informatics, Computing, and Engineering
Myles Brand Hall West 210
918 E. 10th Street
Bloomington, IN 47401

Technical Skills

Languages: Python, Shell Scripting, Java, C/C++, JavaScript, Racket, Julia

Libraries: NumPy, SciPy, scikit-learn, Pandas, NetworkX, pytest

Tools: Git, GitHub, GitHub Actions, JIRA, ReadTheDocs, Travis-CI, CircleCI, AppVeyor, CodeCov, PyPI

Development Platforms: Linux/UNIX, Jekyll, Android, Arduino, Google Cloud Platform

Documentation Tools: LaTeX, Sphinx, Javadoc, Doxygen, Markdown, reStructuredText

Workflows: Continuous Integration (CI), Gitflow

Education

Doctor of Philosophy (Ph.D.) Health Informatics (in progress)
Mathematics Minor
2019–present, expected graduation date: 2026
Luddy School of Informatics, Computing, and Engineering
Indiana University, Bloomington, IN

Master of Science (M.S.) Health Informatics
Class of 2023
Luddy School of Informatics, Computing, and Engineering
Indiana University, Bloomington, IN

Bachelor of Science (B.S.) Computer Science
Security Informatics Minor, Class of 2017, GPA: 3.5 Cumulative
School of Informatics, Computing, and Engineering
Indiana University, Bloomington, IN

Experience

Indiana University, Bloomington
Luddy School of Informatics, Computing, and Engineering

Instructor of Record — (August 2024 — Present)
- Fall 2024 – Information Infrastructure II – INFO-I 211
  Taught from the book Erika Lee and I wrote over the summer, adapting it from an asynchronous online course to a synchronous in-person course.
Associate Instructor — (January 2023 — July 2024)
- Summer 2024 – Information Infrastructure II – INFO-I 211 (asynchronous, online)
  Supervisor: Erika Lee
  Piloted GitHub Actions to finish putting the “auto” in autograding, and wrote the first draft of a book with Erika Lee for teaching an asynchronous web app development course.
- Spring 2024 – Information Infrastructure II – INFO-I 211
  Supervisor: Matt Hottell
  Large cohort (200+ students) stress-tested previous workflows. Worked on autograding infrastructure: enough to orchestrate copies of GitHub repos, automatically test them, or bootstrap a Flask application.
- Fall 2023 – Information Infrastructure II – INFO-I 211
  Supervisor: Matt Hottell
  Took a role as a course manager, writer, and grader. The key objective was to minimize the number of people touching the gradebook.
- Summer 2023 – Information Infrastructure II – INFO-I 211
  Supervisor: Matt Hottell
  Moved all course assignments into GitHub repositories, and introduced “unit testing” to motivate how we know when programs meet expectations.
- Summer 2023 – Information Infrastructure I – INFO-I 210
  Supervisor: Shabnam Kavousian
  Worked with Shabnam Kavousian to develop and teach an alternate “intro Python programming” curriculum with permission from the department chair and director of undergraduate studies. Collected data and wrote an internal pilot study, finding that the students who took the alternate I210 with Shabnam and me tended to outperform their peers in future classes.
- Spring 2023 – Information Infrastructure II – INFO-I 211
  Supervisors: Matt Hottell and Erika Lee
  Led two weekly lab sessions, hosted evening help sessions, guided students during lectures, and learned how the course worked.
Research Assistant — Computer Vision Lab (January 2022 — January 2023)
- Investigated explainability in time series problems
- Implemented a Bayesian network (BN) explainability technique as a Python package targeting the pomegranate library
- Extended the technique toward handling time series problems where the sequence is represented by a dynamic Bayesian network (DBN)
Graduate Mentor - Research Experience for Undergraduates — ProHealth Lab (May 2022 — July 2022)
- Mentored undergraduates on a project analyzing smartwatch data alongside clinical data
- Extended prior work for infrastructure development in the Hoosier Moms Cohort
- Wrote course material for exploratory data analysis, scientific programming, and git
Research Assistant – ProHealth Lab – Precision Health Initiative (January 2019 — December 2021)
- Secondary analysis on incidence of gestational diabetes
  - Developed tools for data cleaning and pre-processing for creating reproducible data partitions: numom2b.org.
  - Solved the binary class imbalance problem (imbalance of 1 to 32).
  - Reduced features (original feature space ~7000 variables)
  - Explained predictions for a clinical decision support setting.
- Infrastructure development for Hoosier Moms Cohort
  - Implemented caching to work with snapshots of the database
  - Decreased analysis time from >72 hours to <5 minutes
  - Prototyped a dashboard for exploratory visualization (hmc-dashboard)

CareBand Inc.
222 West Merchandise Mart Plaza #1230, Chicago, IL

Developer and Machine Learning Research Consultant (February 2020 — August 2020)
- Implemented solutions for indoor location tracking.
- Developed models to analyze trends in user behavior.

The University of Texas at Dallas
Department of Computer Science, Richardson, TX

Teaching Assistant (August 2018 — December 2018)
- Fall 2018 – Automata Theory – CS 4384.001
  Led two lectures on finite automata minimization. Graded assignments and exams, prepared and verified automata examples prior to lectures, and held four hours of office hours per week to answer questions.
Research Assistant – StARLinG Lab (May 2018 — August 2018)
- Extended the lab’s open source tool for converting raw text into relational facts. Rewrote the software so it could be used as a command-line tool or as an imported Python package. Released the software as rnlp.
- Documented, unit tested, and ensured correctness of a Python port of Relational Functional Gradient Boosting (rfgb).
Teaching Assistant (August 2017 — May 2018)
- Spring 2018 – C Programming in a UNIX Environment – CS 3377.501
  Provided feedback on C++ programming assignments and bash scripts in terms of documentation, style, and functionality of code.
- Fall 2017 – Automata Theory – CS 4384.001
  Graded assignments and exams, prepared and verified automata examples prior to lectures, and provided additional support to students outside of class.

Indiana University, Bloomington
Department of Informatics and Computer Science

Undergraduate Researcher, STARAI Lab (July 2017 — July 2018)
- Explored methods combining Natural Language Processing and Statistical Relational Learning for information extraction on SEC Form S-1 Documents.
- Facilitated the public release of the lab’s source code onto GitHub, distributed as BoostSRL. Maintained the BoostSRL wiki and tutorials.
Undergraduate Researcher, ProHealth Research Experience for Undergraduates (May 2016 — August 2016)
- Built on research that previously inferred adverse side-effects of drugs based on text data mined from the web. Our work focused on predicting drug-drug interactions from data mined from OpenFDA, PubMed, and a variety of blogs.
Camp Counselor, SICE Summer Camp (2014, 2015, 2016, 2017)
- Led sessions on intermediate Python programming, Scratch, Raspberry Pi, information security, and data analytics.
- Introduced high school students to Indiana University’s campus and guided them between sessions where they learned about computer science and informatics.

Software

srlearn: A Python Library for Gradient-Boosted Statistical Relational Models
SRLBoost: A Java library for learning and inference with SRL models: up to 15x faster than existing libraries
- Source on GitHub
relational-datasets: Python/Julia libraries for working with benchmark datasets for statistical relational learning
- Source on GitHub
- Documentation
rnlp: Converting text to relational facts

Publications and Poster Presentations

Alexander L. Hayes, Lucas Newman-Johnson, David Crandall. 2022. Dynamic Bayesian Rule Learning for Interpretable Time Series Prediction. Poster Presentation. April 29, 2022. Innovation Hall, IUPUI, Indianapolis, IN, USA.
Athresh Karanam, Alexander L. Hayes, Harsha Kokel, David M. Haas, Predrag Radivojac, and Sriraam Natarajan. 2021. A Probabilistic Approach to Extract Qualitative Knowledge for Early Prediction of Gestational Diabetes. Nineteenth International Conference on Artificial Intelligence in Medicine. June 15-18, 2021. Online (Hosted in Porto, Portugal). https://doi.org/10.1007/978-3-030-77211-6_59
Alexander L. Hayes. 2020. srlearn: A Python Library for Gradient-Boosted Statistical Relational Models. Ninth International Workshop on Statistical Relational AI. February 7, 2020. New York City, NY, USA.
Alexander L. Hayes. 2019. srlearn: A Python Library for Gradient-Boosted Statistical Relational Models. HCI Fest Poster Presentation. December 5, 2019. Bloomington, IN, USA.
Alexander L. Hayes, Mayukh Das, Phillip Odom, and Sriraam Natarajan. 2017. User Friendly Automatic Construction of Background Knowledge: Mode Construction from ER Diagrams. Knowledge Capture Conference (K-CAP '17). December 4-6, 2017. Austin, TX, USA. https://doi.org/10.1145/3148011.3148027
Alexander Hayes, Savannah Smith, Ciabhan Connelly, Devendra Dhami, and Sriraam Natarajan. 2016. Predicting Drug-Drug Interactions: Combining Machine Learning and Natural Language Processing. ProHealth REU, School of Informatics and Computing, Indiana University. July 2016. Bloomington, IN, USA.
Aaron Porter and Alexander Hayes. 2016. Stress-Induced Video Capture: Forensic Capture for People with Visual Impairments. School of Informatics and Computer Science Spring Research Symposium. April 2016. Bloomington, IN, USA.

Conference Attendance

The 2024 Decoding the Disciplines Conference: Adapting Decoding for the Next Generation 2024: Indiana University School of Education, Bloomington, Indiana. (2024-10-31, 2024-11-02)
International Conference on Artificial Intelligence in Medicine (AIME) 2021: Online, Hosted in Porto, Portugal. Spotlight Paper Presentation. (2021-06-15, 2021-06-18)
Association for the Advancement of Artificial Intelligence (AAAI) 2020: Hilton New York Midtown, New York, New York, USA. Workshop Poster Presentation. (2020-02-06, 2020-02-08)
- Ninth International Workshop on Statistical Relational AI (StarAI 2020)
International Conference on Machine Learning (ICML) 2019: Long Beach Convention Center, Long Beach, California, USA. Attendee. (2019-06-14, 2019-06-15)
- 2019 Workshop on Human-in-the-Loop Learning (HILL)
- The Third Workshop on Tractable Probabilistic Modeling (TPM)

Service - Open Source Contributions

`scikit-learn-contrib / imbalanced-learn`

imbalanced-learn “A Python package to Tackle the Curse of Imbalanced Datasets in Machine Learning”

Changes proposed:

Code review:

Community questions I helped resolve:

`SPFlow / SPFlow`

SPFlow “An easy and extensible library for sum-product networks.”

Changes proposed:

Community questions I helped resolve:

How can I reproduce the example visualization?

`microsoft / LightGBM`

LightGBM “A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.”

Changes proposed:

Code review:

Limiting files checked during documentation generation

Community questions I helped resolve:

`google-research / arxiv-latex-cleaner`

arxiv-latex-cleaner: “arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv”

Changes proposed:

Converting to a Python package and providing a console script