Design a Search Engine

Build a web-scale search engine that crawls the web, indexes the pages it finds, and retrieves relevant results for user queries.

Functional Requirements

  • Crawl and index billions of web pages
  • Return relevant results ranked by importance
  • Support full-text search with Boolean operators
  • Offer autocomplete and spelling suggestions
  • Personalize results based on user history

Non-Functional Requirements

  • Query response time under 200ms
  • Fresh index (pages indexed within hours)
  • Handle 10K+ queries per second
  • Highly relevant, high-quality results

Questions to Consider

  • How will you rank pages for relevance?
  • What data structures enable fast full-text search?
  • How do you keep the index fresh with billions of pages?
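
As a starting point for the data-structure question above, one common choice is an inverted index, which maps each term to the set of documents that contain it. The sketch below is a minimal in-memory illustration, not a prescribed solution: the names `InvertedIndex`, `tokenize`, `search_and`, and `search_or` are hypothetical, tokenization is a naive lowercase split, and a real system would shard the index and store compressed posting lists along with positions and ranking signals.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Hypothetical in-memory index: term -> set of document IDs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Record every term of the document in its posting set."""
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search_and(self, query):
        """Boolean AND: IDs of documents containing every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        # Intersect the smallest posting sets first to keep intermediate results small.
        sets = sorted((self.postings.get(t, set()) for t in terms), key=len)
        result = set(sets[0])
        for posting in sets[1:]:
            result &= posting
        return result

    def search_or(self, query):
        """Boolean OR: IDs of documents containing at least one query term."""
        result = set()
        for term in tokenize(query):
            result |= self.postings.get(term, set())
        return result
```

A lookup touches only the posting sets of the query terms, so its cost depends on how common those terms are rather than on the total number of indexed pages.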

Your Solution

Web Crawler

Design the distributed web crawler. Consider the URL frontier, politeness policies, duplicate detection, and the handling of JavaScript-rendered pages.
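
To make three of these concerns concrete, here is a minimal single-process sketch of a URL frontier that combines exact-URL duplicate detection with a fixed per-host politeness delay. It is an illustration under stated assumptions, not a reference design: the class name `UrlFrontier`, the `delay_seconds` parameter, and the injectable `clock` are hypothetical, and the sketch ignores robots.txt, URL canonicalization beyond fragment removal, prioritization, distribution across machines, and JavaScript rendering.

```python
import heapq
import time
from collections import defaultdict, deque
from urllib.parse import urldefrag, urlsplit


class UrlFrontier:
    """Hypothetical frontier: per-host FIFO queues plus a heap of next-allowed fetch times."""

    def __init__(self, delay_seconds=1.0, clock=time.monotonic):
        self.delay = delay_seconds        # assumed fixed crawl delay per host
        self.clock = clock                # injectable so the behaviour can be tested
        self.seen = set()                 # exact-URL duplicate detection
        self.queues = defaultdict(deque)  # host -> pending URLs
        self.next_allowed = {}            # host -> earliest time of the next fetch
        self.ready = []                   # heap of (next_allowed_time, host)

    def add(self, url):
        """Enqueue a URL; return False if it is malformed or already seen."""
        url, _fragment = urldefrag(url)   # fragments never reach the server
        host = urlsplit(url).netloc.lower()
        if not host or url in self.seen:
            return False
        self.seen.add(url)
        queue = self.queues[host]
        if not queue:
            # A host is on the heap only while it has pending URLs.
            heapq.heappush(self.ready, (self.next_allowed.get(host, 0.0), host))
        queue.append(url)
        return True

    def next_url(self):
        """Return a URL that is polite to fetch now, or None if none is ready."""
        now = self.clock()
        if not self.ready or self.ready[0][0] > now:
            return None
        _, host = heapq.heappop(self.ready)
        url = self.queues[host].popleft()
        self.next_allowed[host] = now + self.delay
        if self.queues[host]:
            heapq.heappush(self.ready, (self.next_allowed[host], host))
        return url
```

Grouping URLs by host means a slow or rate-limited site delays only its own queue. In a distributed design, hosts could be partitioned across crawler nodes so that each host's delay is enforced in one place, and the `seen` set would likely be replaced by a shared store or a probabilistic structure such as a Bloom filter.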