h1

Evaluating MapReduce for Multi-core and Multiprocessor Systems

January 2, 2008

This paper evaluates the suitability of the MapReduce model for multi-core and multiprocessor systems. MapReduce was created by Google for application development on data centers with thousands of servers. It allows programmers to write functional-style code that is automatically parallelized and scheduled in a distributed system.

We describe Phoenix, an implementation of MapReduce for shared-memory systems that includes a programming API and an efficient runtime system. The Phoenix runtime automatically manages thread creation, dynamic task scheduling, data partitioning, and fault tolerance across processor nodes. We study Phoenix with…
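To make the model concrete, here is a minimal word-count sketch of the map and reduce phases running on a thread pool in shared memory. It is not Phoenix's API (Phoenix is a C runtime); the names `map_words`, `reduce_counts`, and `map_reduce` are hypothetical and chosen only for illustration. In CPython, threads will not speed up CPU-bound work, so treat this as a picture of the structure rather than a benchmark.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(chunk):
    """Map step: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_counts(item):
    """Reduce step: sum all values collected for a single key."""
    word, counts = item
    return word, sum(counts)


def map_reduce(chunks, workers=4):
    """Run both phases on a thread pool (hypothetical helper, not Phoenix)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map phase: each chunk is processed independently of the others.
        mapped = list(pool.map(map_words, chunks))

        # Group the intermediate pairs by key before reducing.
        grouped = defaultdict(list)
        for pairs in mapped:
            for word, count in pairs:
                grouped[word].append(count)

        # Reduce phase: each key is reduced independently of the others.
        return dict(pool.map(reduce_counts, grouped.items()))


if __name__ == "__main__":
    print(map_reduce(["the quick brown fox", "the lazy dog", "The end"]))
```

The programmer writes only the two small functions; splitting the input, scheduling the work, and collecting results are left to the runtime, which is the part Phoenix automates.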

link to video

h1

PHP on Hormones

January 2, 2008

“PHP on Hormones” – Rasmus Lerdorf, Yahoo! In order to build a successful modern web application you have to engage the user at a very primal level and harness each and every user in such a way that every action enhances the overall experience for all users. Translating this concept into code and making sure that an exponentially growing user base doesn’t bring your application to its knees becomes a very interesting problem to solve.

link to video

h1

Pasha Sadri, on Yahoo Pipes

January 1, 2008

“Pipes” – Pasha Sadri, Principal Software Engineer, Advanced Development Division, Yahoo! Pipes is a service platform for processing well-structured data such as RSS, Atom, and RDF feeds in a web-based visual programming environment. Developers can use Pipes to combine data sources and user input into mashups without having to write code. These mashups, analogous in some ways to Unix pipes, can power badges on personal publishing sites, provide core functionality for web applications, or serve as reusable components within the Pipes platform itself.

link to video

h1

Kaj Arnö leads this entertaining discussion with the brightest and best database architecture minds.

January 1, 2008

Kaj Arnö leads this entertaining discussion with the brightest and best database architecture minds. Each ego highlights their vision of database architecture now and in the future, and how the focus of their development and research has guided each of their contributions to the state of database system architecture.

link to video

h1

Marten Mickos at MySQL Conference

January 1, 2008

“Welcome and State of MySQL AB” – Marten Mickos, CEO of MySQL AB, gives an update on how MySQL AB is doing, including a view behind the scenes of the business, the vision, the phenomenal growth, and some of the plans for the future.

Link to video

h1

Video: Scaling MySQL at YouTube

January 1, 2008

h1

Scaling MySQL at YouTube

January 1, 2008

A great podcast, thanks to IT Conversations.

In mid-2006, YouTube served approximately 100 million videos in a single day. To maintain a website of that scale, one would imagine YouTube has hundreds of DBAs. In fact, just three people make it all work. Paul Tuckfield, the MySQL DBA at YouTube, shares horror stories about scalability at YouTube and how he coped with them to keep the show going every day, while learning important lessons along the way.

YouTube uses MySQL as its back end. When Paul joined YouTube, he had 15 years of experience solving database scalability problems and administering computer networks, but he was completely new to MySQL. Within weeks, the challenges of scaling MySQL taught him more than one could otherwise learn over years, and he is excited to share his insights.

According to him, the three most important reasons for YouTube’s scalability are Python, Memcache, and MySQL replication, with replication having the most impact. Most people think that the answer to scalability lies in upgrading hardware and CPU power, but adding CPUs doesn’t work on its own; the wisdom is in getting the maximum amount of RAM for the CPU and then fine-tuning. He talks about replication in detail, sharing his experience in dealing with problems such as time lags in replication between master and slave disks, RAID caching, OS-level caching on Linux, and caching at the database.

Link to podcast

Update:

Here’s a link to the video.

h1

Yahoo on Hadoop

December 26, 2007

Here’s a great podcast about Yahoo’s involvement in Hadoop.