First session of the day was on Scaling Python on the Web; rough notes which I may clean up later:
- How fast is fast enough?
- Don’t prematurely optimize
- Know where the bottlenecks are, and optimize those specifically
- Orders of magnitude: static (httpd), dynamic (python), db-queried
- Even 40 req/s is ~3.4M pages/day (quick arithmetic below)
- Hundreds to low thousands of dynamic page views is usually good enough
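A quick back-of-envelope check of that 40 req/s figure (my own arithmetic, just making the numbers above concrete):

```python
# Requests per second -> pages per day, back of the envelope.
SECONDS_PER_DAY = 60 * 60 * 24  # 86,400

def pages_per_day(req_per_sec):
    return req_per_sec * SECONDS_PER_DAY

print(pages_per_day(40))  # 3,456,000 -- roughly the 3.4M pages/day quoted above
```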
- Scaling isn’t about the language, it’s about:
- DRY: cache!
- share nothing
- built a sample photo-app, FlickrKillr, for demonstration purposes
- preloaded with 100k's of users, 10-20 photos each
- first iteration: CGI (sketch below)
- roughly 23 requests/second
- problems:
- loading Python interpreter for each request
- all resources initialized for each request (inc. db connection)
- possible remedies:
- run a Python web server (long-running process)
- make one db connection per thread instead of request
- other remedies:
- fastcgi
- snakelets, twisted.web, RhubarbTart
- mod_python
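A minimal sketch of what the CGI iteration might look like (the script name, table layout, and use of sqlite3 are my assumptions, not from the talk). The point is the per-request overhead: every hit starts a new interpreter, re-imports everything, and opens a fresh database connection.

```python
#!/usr/bin/env python
# photos.cgi -- hypothetical CGI handler for the FlickrKillr demo.
import cgi
import sqlite3  # stand-in for whatever database the demo actually used

def main():
    form = cgi.FieldStorage()
    user_id = form.getfirst("user_id", "1")

    conn = sqlite3.connect("flickrkillr.db")  # new connection on every request (slow)
    rows = conn.execute(
        "SELECT title FROM photos WHERE user_id = ?", (user_id,)
    ).fetchall()
    conn.close()

    print("Content-Type: text/html")
    print("")
    print("<ul>%s</ul>" % "".join("<li>%s</li>" % title for (title,) in rows))

if __name__ == "__main__":
    main()
```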
- second iteration: Python app server (CherryPy used for this demo; sketch below)
- roughly 139 requests per second
- problems
- global interpreter lock — can only utilize one core on a dual core machine
- sessions in the database — prefer an in-memory session store
- remedies:
- run multiple instances of CherryPy (to overcome the GIL)
- but then we need to balance with something like nginx
- other options
- cherrypy in mod_python
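A rough sketch of the app-server version (the handler and table are hypothetical; the API shown is CherryPy 3-style `expose`/`quickstart`). Because the process is long-running, imports happen once and the database connection can be reused rather than rebuilt per request:

```python
# flickrkillr_app.py -- hypothetical long-running app-server version.
import sys
import sqlite3
import cherrypy

class Root(object):
    def __init__(self):
        # One connection for the process lifetime; per-thread connections would
        # be the next refinement, as suggested in the talk.
        self.conn = sqlite3.connect("flickrkillr.db", check_same_thread=False)

    @cherrypy.expose
    def photos(self, user_id="1"):
        rows = self.conn.execute(
            "SELECT title FROM photos WHERE user_id = ?", (user_id,)
        ).fetchall()
        return "<ul>%s</ul>" % "".join("<li>%s</li>" % t for (t,) in rows)

if __name__ == "__main__":
    port = int(sys.argv[1]) if len(sys.argv) > 1 else 8080
    cherrypy.config.update({"server.socket_port": port})
    cherrypy.quickstart(Root())
```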
- version 3: load balancing with nginx (launcher sketch below)
- 217 requests/sec
- outstanding problems
- static files read from disk every time
- and they’re being read/written from python
- solutions:
- memcached
- combine memcached w/ nginx
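To get past the GIL, the remedy is several CherryPy processes behind nginx. A hypothetical launcher (filename and ports are my choices), with the rough shape of the corresponding nginx upstream block in the comment:

```python
# run_workers.py -- start one CherryPy process per core, each on its own port.
# nginx would then proxy to them with something roughly like:
#
#   upstream flickrkillr {
#       server 127.0.0.1:8081;
#       server 127.0.0.1:8082;
#   }
#   server {
#       listen 80;
#       location / { proxy_pass http://flickrkillr; }
#   }
import subprocess
import sys

PORTS = [8081, 8082]  # one worker per core sidesteps the single-core GIL limit

workers = [subprocess.Popen([sys.executable, "flickrkillr_app.py", str(port)])
           for port in PORTS]
for w in workers:
    w.wait()
```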
- version 4: caching (memcached sketch below)
- 616 req/sec (benchmarking w/ homegrown tool)
- 1750 req/sec (benchmarking w/ ab)
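A sketch of the caching step using the python-memcached client (`memcache.Client` is that library's API; the key scheme and 60-second TTL are my assumptions). The idea is to serve repeat hits from memory instead of re-querying and re-rendering:

```python
import memcache  # python-memcached client library

mc = memcache.Client(["127.0.0.1:11211"])

def user_page(user_id, render):
    """Return the rendered page for user_id, caching it for 60 seconds.

    `render` is whatever function actually hits the database and builds HTML.
    """
    key = "user_page:%s" % user_id
    page = mc.get(key)
    if page is None:             # cache miss: do the expensive work once
        page = render(user_id)
        mc.set(key, page, time=60)
    return page
```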
- other notes:
- don’t forget to index (sketch below)
- without an index, the fourth iteration falls down to 28 requests/sec
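What "don't forget to index" looks like in practice, as a small sqlite3 sketch (column and index names are hypothetical). Without the index, every lookup is a full table scan, which is the kind of thing that drags the cached version back down to ~28 req/s:

```python
import sqlite3

conn = sqlite3.connect("flickrkillr.db")

# The lookup every page does:
#   SELECT title FROM photos WHERE user_id = ?
# Without an index on user_id this scans the whole photos table per request.
conn.execute("CREATE INDEX IF NOT EXISTS ix_photos_user_id ON photos (user_id)")
conn.commit()

# Sanity check: EXPLAIN QUERY PLAN should now mention the index.
for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT title FROM photos WHERE user_id = ?", (42,)
):
    print(row)
```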
Demo code: http://www.polimetrix.com/pycon/demo.tar.gz
date: 2007-02-24 12:35:29
wordpress_id: 490
layout: post
slug: scaling-python-on-the-web
comments:
category: pycon2007, python