SudoPizzaMe
Web Crawling, NLP, Information Retrieval (IR)

Vertical Search Engine

Delivered a fully functional search interface that provides researchers with up-to-date, categorized access to institutional knowledge.

Objectives

  • Automate the extraction of research metadata (titles, authors, publication years).
  • Implement relevance ranking for keyword-based queries.
  • Categorize research outputs through automated subject classification.

Solution

  • Built an automated metadata extraction pipeline that scrapes and parses RCIH publication outputs weekly.
  • Implemented TF-IDF-based relevance ranking and BM25 scoring to improve search precision.
  • Integrated a Subject Classification module to automatically tag research papers into healthcare domains.
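
To illustrate the ranking step, here is a minimal sketch of BM25 scoring over a handful of paper titles. It is not the project's actual code: the function names (tokenize, bm25_scores, rank), the whitespace tokenizer, the sample titles, and the parameter defaults k1=1.5 and b=0.75 are all assumptions chosen for illustration. The IDF uses the common non-negative variant, log(1 + (N - df + 0.5) / (df + 0.5)).

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query.

    k1 and b are conventional defaults, assumed here for illustration.
    """
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Guard against a corpus of empty documents.
    avg_len = (sum(len(d) for d in tokenized) / n_docs) or 1.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    query_terms = tokenize(query)
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def rank(query, docs):
    # Document indices ordered from most to least relevant.
    scores = bm25_scores(query, docs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Made-up titles, used only to exercise the functions.
    titles = [
        "diabetes care in rural clinics",
        "machine learning for cardiology imaging",
        "cardiology outcomes and diabetes risk",
    ]
    for i in rank("cardiology imaging", titles):
        print(titles[i])
```

A production index would more likely precompute term statistics once per weekly crawl rather than per query, and would use a tokenizer that handles punctuation and stop words.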