
In this talk, Adam Gibson from skymind.io presents the ND4J framework using an iScala notebook.
Combined with Spark's DataFrames, this makes real data science viable in Scala. ND4J is "NumPy for Java." It supports multiple architectures (backends), allowing runtime-neutral scientific computing as well as chip-specific optimizations, all from the same code. Algorithm developers and scientific engineers can write code for a Spark, Hadoop, or Flink cluster while keeping the underlying computations platform-agnostic. A modern JVM runtime that can work with GPUs lets engineers leverage the best parts of the production ecosystem without having to pick a single scientific library.
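The backend-neutral idea can be sketched in plain Scala. This is a hypothetical illustration, not the real ND4J API: algorithm code is written once against a trait, and each chip-specific backend (CPU, GPU, ...) supplies its own implementation behind that trait.

```scala
// Hypothetical sketch of backend-neutral numeric code (not the ND4J API).
// The algorithm depends only on the TensorBackend trait; swapping in a
// GPU-backed implementation would require no algorithm changes.
trait TensorBackend {
  def name: String
  def dot(a: Array[Double], b: Array[Double]): Double
}

object CpuBackend extends TensorBackend {
  val name = "cpu"
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum
}

// Written once, against the trait only: runs on whichever backend is passed in.
def dotProduct(backend: TensorBackend,
               a: Array[Double], b: Array[Double]): Double =
  backend.dot(a, b)

val result = dotProduct(CpuBackend, Array(1.0, 2.0), Array(3.0, 4.0))
// 1*3 + 2*4 = 11.0
```

ND4J applies the same pattern at the native-library level, selecting a CPU or GPU backend at runtime while user code stays unchanged.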

Dean Chen

Apache Spark is a next-generation engine for large-scale data processing built with Scala. Dean Chen, a software engineer at eBay, discusses how Spark takes advantage of Scala's functional idioms to produce an expressive and intuitive API for big data analysis. Dean covers the design of Spark RDDs and how that abstraction enables the Spark execution engine to be extended to support a wide variety of use cases: Spark SQL, Spark Streaming, MLlib, and GraphX.
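Those functional idioms are the same ones Scala programmers already use on local collections. As a rough sketch (local `Seq` here, standing in for an RDD; with Spark, `sc.parallelize(logs)` would distribute the same pipeline across a cluster):

```scala
// The RDD API mirrors Scala collection idioms: the same map / filter
// pipeline runs on a local Seq or, via sc.parallelize, on a cluster.
val logs = Seq("INFO start", "ERROR disk full", "INFO done", "ERROR timeout")

val errorCount = logs
  .map(_.split(" ", 2))     // tokenize each log line into (level, message)
  .filter(_(0) == "ERROR")  // keep only error records
  .map(_ => 1)              // count each record as one
  .sum                      // aggregate (a reduce, when distributed)
// errorCount == 2
```

Because transformations like `map` and `filter` are just functions over data, the execution engine is free to plan, pipeline, and distribute them, which is what lets Spark SQL, Spark Streaming, MLlib, and GraphX all build on the same core.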


This was recorded at the Scala Bay meetup at PayPal.

Evan Chan

In this talk, Evan Chan, a software engineer at Ooyala, presents on real-time analytics using Cassandra, Spark, and Shark at Ooyala. He reviews the Cassandra analytics landscape (Hadoop & Hive), goes over custom input formats for extracting data from Cassandra, and shows how Spark and Shark improve query speed and productivity over standard solutions. This talk was recorded at the DataStax Cassandra South Bay Users meetup at Ooyala.