match coverage] * comments [the blogging age, remember?] * live streaming [ustream does not exist, yet] * live coverage of events [cover it live does not exist, yet] * user-centric design [personalization, ratings] * even more videos [I can haz more LOLCats]
150% traffic growth but [6 months planning ahead] $ video costs [bandwidth cost: 1€/GB] * comments costs [DB writes, CPU, disk i/o] * live streaming costs [bandwidth cost: 1€/GB] * limited iron resources, not happy with our current host [dedicated managed servers in top GR Datacenter]
VS 1€/GB] * made videos section possible... * ...and advertisers loved it ($$$+) * first GR site to focus on video, key competitive advantage * 6TB video traffic in the first month * hired a video editing team to support the demand
the main website [Windows 2003, IIS 6] * 2x(n) Application servers for APIs [Windows 2003, IIS 6] * 2x(n) Servers for banner managers [CentOS, Apache, OpenX] * 1x Storage server * 2x Database servers [MS SQL Server 2008 with failover] * 2x Reverse Proxy cache servers [Squid] * 2x Load Balancers [HAProxy with failover] * 1x monitoring server [munin with a lot of custom plugins]
grows over 60% for 2 minutes, add another application server * if average CPU usage falls below 30% for 5 minutes, kill gracefully an application server * 20 instances on peaks * 3 instances (minimum) on normal operations * no more “Server is busy” errors * pay only what you (really) need * you can now sleep at nights * 60% overall cost reduction
CloudFront * use multiple CNAMES with CloudFront to boost HTTP requests [as per YSlow recommends] * CloudFront custom domains are sexy * robust DNS with Route 53 * simple monitoring with CloudWatch [you still need an external monitoring tool]
instances saves you $$ * EC2 is a hacker playground [prepare for DOS attacks] * backup entire AMIs to S3 [instances *WILL* #FAIL] * EBS disk I/O is slow, but amazon is working on this [problems with DB writes] * spawning new instances is slow [15 mins provisioning can be a show stopper on scaling] * S3 uploads/downloads are slow * sticky session is a must [we replaced AWS ELB with HAProxy just for this] * SLAs can't guarantee high availability [AWS *WILL* #FAIL]
[interested? I’m hiring] * automate everything [makes you sleep at night] * monitor everything [munin is your friend] * disaster prevention [work *ALWAYS* around the worst case scenario] * windows server administration is a mess [and AWS is not making this prettier] * DB scale is the hardest part [code changes] * legacy software *IS* a problem ** on scaling ** on hiring ** on growing (have you tried to use XMPP via ASP?)
compared to CloudFront [especially in Greece] * not affordable for large architectures [if you’re running 300+ instances, you should consider making your own datacenter]