Scaling Python on the Web

First session of the day was on Scaling Python on the Web; rough notes which I may clean up later:

  • How fast is fast enough?
    • Don’t prematurely optimize
    • Know where the bottlenecks are, and optimize those specifically
  • Orders of magnitude: static (httpd), dynamic (python), db-queried
  • Even 40 req/s is about 3.4M pages/day (40 × 86,400 seconds)
  • Hundreds to low thousands of dynamic page views per second is usually good enough
  • Scaling isn’t about the language, it’s about:
    • DRY: cache!
    • share nothing
  • built a sample photo-app, FlickrKillr, for demonstration purposes
    • preloaded with hundreds of thousands of users, 10-20 photos each
  • first iteration: CGI
    • roughly 23 requests/second
    • problems:
      • loading Python interpreter for each request
      • all resources initialized for each request (inc. db connection)
    • possible remedies:
      • run a Python web server (long-running process)
      • make one db connection per thread instead of request
    • other remedies:
      • fastcgi
      • snakelets, twisted.web, RhubarbTart
      • mod_python
  • second iteration: python app server (CherryPy used for this demo)
    • roughly 139 requests per second
    • problems
      • global interpreter lock — can only utilize one core on a dual core machine
      • sessions in the database — prefer an in-memory session store
    • remedies:
      • run multiple instances of CherryPy (overcome the GIL)
      • but then we need to balance with something like nginx
    • other options
      • cherrypy in mod_python
  • version 3: load balancing with nginx
    • 217 requests/sec
    • outstanding problems
      • static files read from disk every time
      • and they’re being read/written from python
    • solutions:
      • memcached
      • combine memcached w/ nginx
  • version 4: caching
    • 616 req/sec (benchmarking w/ homegrown tool)
    • 1750 req/sec (benchmarking w/ ab)
  • other notes:
    • don’t forget to index
    • without an index, the fourth iteration falls down to 28 requests/sec
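
To make the "one db connection per thread instead of request" remedy from the first iteration concrete, here is a minimal sketch. It is not the speaker's code: sqlite3 stands in for whatever database the demo used, and DB_PATH, get_connection and handle_request are names made up for illustration.

```python
import sqlite3
import threading

# Assumption: an in-memory SQLite database stands in for the real one.
DB_PATH = ":memory:"

# Each thread sees its own attributes on this object.
_local = threading.local()


def get_connection():
    """Return the calling thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(DB_PATH)
        _local.conn = conn
    return conn


def handle_request():
    """Hypothetical request handler: reuses the thread's connection."""
    return get_connection().execute("SELECT 1").fetchone()[0]
```

In a long-running server with a fixed thread pool, the connection cost is then paid once per worker thread rather than once per request, which is the point of moving off CGI.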
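
The "don't forget to index" note is easy to check for yourself. The sketch below uses sqlite3 (an assumption; the notes don't say which database the demo ran on) and a made-up photos table to show the query plan switching from a full table scan to an index search once the index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, loosely modelled on a photo app.
conn.execute(
    "CREATE TABLE photos (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO photos (user_id, title) VALUES (?, ?)",
    [(n % 100, "photo %d" % n) for n in range(1000)],
)


def query_plan(sql, params=()):
    """Return SQLite's human-readable plan for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the plan description.
    return " | ".join(row[-1] for row in rows)


QUERY = "SELECT title FROM photos WHERE user_id = ?"

# Without an index, SQLite has to scan every row.
plan_before = query_plan(QUERY, (42,))

conn.execute("CREATE INDEX idx_photos_user_id ON photos (user_id)")

# With the index, it can search directly for the matching rows.
plan_after = query_plan(QUERY, (42,))
```

Printing plan_before and plan_after shows the difference; the exact wording of the plan text varies between SQLite versions.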

date:2007-02-24 12:35:29
category:pycon2007, python