Hadoop: Recent News From Mahout


Ted Dunning



Let us preface this by apologizing for poor recording quality! We had a technical issue that caused clipping on the audio. :(
We will get a transcript up soon to save your ears.

Despite that, we've got a recording from the Hadoop meetup held last night at AppNexus. Ted Dunning, Chief Application Architect at MapR Technologies, talks about recent developments in Mahout and real-time learning.

In particular, Ted covers the results from quality and speed testing of Mahout's new super-fast k-means clustering algorithms (hint: quality is very good and speed is phenomenal). He also dives deep into a design for an online clustering facility that can cluster the full content of the Twitter firehose into thousands of clusters in real time.

Full transcript of this talk can be found here


Transcript


News from Mahout from Ted Dunning

Bio:
Ted Dunning has held Chief Scientist positions at Veoh Networks, ID Analytics and MusicMatch (now Yahoo Music). Ted is responsible for building the most advanced identity theft detection system on the planet, as well as one of the largest peer-assisted video distribution systems and ground-breaking music and video recommendation systems. Ted has 15 issued and 15 pending patents and contributes to several Apache open source projects, including Hadoop, ZooKeeper and HBase. He is also a committer for Apache Mahout. Ted earned a BS degree in electrical engineering from the University of Colorado, an MS degree in computer science from New Mexico State University, and a Ph.D. in computing science from Sheffield University in the United Kingdom. Ted also bought the drinks at one of the very first Hadoop User Group meetings.
