Hadoop Security Design? Just Add Kerberos? Really?

Distributed computing is alive and well in 2010. The Hadoop project is carrying the banner for open source distributed computing with its Hadoop Distributed File System and MapReduce engine. Hadoop is in use at many of the world's largest online media companies, including Facebook, Fox Interactive Media, LinkedIn, Powerset (now part of Microsoft), and Twitter. Hadoop is entering the enterprise, as evidenced by Hadoop World 2009 presentations from Booz Allen Hamilton and JP Morgan Chase. Hadoop has also been elevated to the "cloud" and made available as a service by Amazon and Sun. What the heck is it? Can it be secure? What do I do if I discover it on a network I am testing?

When Hadoop development began in 2004, no effort was expended on creating a secure distributed computing environment. In 2009, discussion about Hadoop security reached a boiling point. The developers behind Hadoop decided they needed to get some of that "security" stuff. After a thorough application of Kerberos pixie dust, Hadoop is now secure, or is it?

This talk will describe the types of attacks the Hadoop team attempted to prevent, as well as the types of attacks the team decided to ignore. We will determine whether Hadoop was made any more secure through the application of copious amounts of Kerberos. We will close with a short discussion of how to approach a Hadoop deployment from the perspective of a penetration tester.

Presented by