eBay NYC: How To Use Scala on Hadoop by Adam Ilardi


Adam Ilardi



This talk is by Adam Ilardi, a data scientist at eBay, and was recorded at the NY Scala meetup at eBay NYC. Adam talks about eBay's transition from Pig and raw Cascading to Scalding and explains other ways they use Scala.

 

 


Scala and Hadoop @ eBay from ebaynyc

Bio: Adam is an Applied Researcher/Data Scientist in eBay's NYC office. He previously worked as a Technology Analyst for Goldman Sachs, a Software Engineer at CSC, and a tech consultant for a range of companies such as Sun Microsystems. His associates have described him as "one of the most hard working gifted people I know."

Want to hear from more top engineers?

Our weekly email contains the best software development content and interviews with top CTOs. Enter your email address now to stay in the loop.

Background on Scala

Scala is a relatively new language that combines functional programming with a full object-oriented model, including class hierarchies that support code reuse and extensibility. The name Scala is a blend of "scalable language," not an acronym. People who use Scala often praise its elegance and concision. It is direct enough for writing short scripts, yet it is also used to build high-volume, large-scale systems. Part of that range comes from Scala being a full-fledged object-oriented language.

Scala treats every value as an object, and every operation on a value as a method call. It supports many common object-oriented design patterns. But Scala is also a functional language: functions are first-class values, and each function value is itself an object.
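To make the "values are objects, functions are objects" idea concrete, here is a minimal sketch in plain Scala. The names `ValuesAsObjects`, `double`, and `Triple` are hypothetical and chosen only for illustration; nothing here comes from the talk itself.

```scala
object ValuesAsObjects {
  // `1 + 2` is shorthand for calling the method `+` on the Int value 1.
  val sumInfix: Int = 1 + 2
  val sumMethod: Int = (1).+(2)

  // A function literal is a value; it is an object with an `apply` method.
  val double: Int => Int = x => x * 2
  val viaCall: Int = double(21)
  val viaApply: Int = double.apply(21)

  // Conversely, any object that defines `apply` can be called like a function.
  // `Triple` is a hypothetical name used only for this sketch.
  object Triple {
    def apply(x: Int): Int = x * 3
  }
  val tripled: Int = Triple(14)

  // Functions can be passed to methods like any other value.
  val doubledList: List[Int] = List(1, 2, 3).map(double)
}
```

Both spellings of the addition give the same result, as do `double(21)` and `double.apply(21)`, because in each case the shorter form is syntax sugar for the method call.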

 

Background on Hadoop

Hadoop, an Apache project, is open source software for reliable, scalable distributed computing. It is a framework for storing and processing large data sets across clusters of computers. Hadoop can run on a single machine or scale out to many, pooling the storage and computation capacity of each one.
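The idea of spreading work across machines and then combining the results can be sketched locally with ordinary Scala collections. This is only an illustration under stated assumptions: it does not use any Hadoop API, the partitions merely stand in for cluster nodes, and the names `PartitionedWordCount`, `countPartition`, `merge`, and `wordCount` are hypothetical.

```scala
object PartitionedWordCount {
  // Per-partition step: count the words found in one slice of the input.
  def countPartition(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.toLowerCase.split("\\s+"))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size }

  // Combine step: merge two partial counts by summing the values per word.
  def merge(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    b.foldLeft(a) { case (acc, (word, n)) =>
      acc.updated(word, acc.getOrElse(word, 0) + n)
    }

  // Split the input into roughly `partitions` slices (stand-ins for nodes),
  // count each slice independently, then merge the partial results.
  def wordCount(lines: Seq[String], partitions: Int): Map[String, Int] = {
    val sliceSize = math.max(1, math.ceil(lines.size.toDouble / partitions).toInt)
    lines
      .grouped(sliceSize)
      .map(countPartition)
      .foldLeft(Map.empty[String, Int])(merge)
  }
}
```

For example, `PartitionedWordCount.wordCount(Seq("a b", "b c", "a a"), 2)` counts two slices separately and merges them. On a real cluster the per-slice work would run on different machines next to the data they hold, which is the part a framework like Hadoop takes care of.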

Photo from the event

More on Scala and Hadoop:
Moving beyond object-oriented programming with Scala

Hadoop: Recent News From Mahout

Thumbtack: NoSQL Database Comparison by Ben Engber

Keep in touch

If you liked this talk, you can find more tech talks and articles like it by subscribing to our newsletter. It goes out each week and brings you some of the best recent articles, videos, and even jobs in open source technology. You can also stay in touch by following our founder Pete Soderling on Twitter @petesoder.

