We are happy to announce that Cascading 1.2 is now publicly available for download.
This release features many performance and usability enhancements while remaining backwards compatible with 1.0 and 1.1.
Specifically:
- Performance optimizations during grouping (StreamComparator)
- Composable map-side partial aggregations (AggregateBy)
- Native Riffle support for non-Cascading (or nested iterative Cascading) processes (ProcessFlow and Riffle)
For a detailed list of changes see:
CHANGES.txt
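To give a feel for what map-side partial aggregation means, here is a minimal plain-Java sketch of the general idea. It does not use the Cascading API and is not how AggregateBy is implemented; the class `PartialSumSketch`, its methods, and the `threshold` parameter are hypothetical names chosen for illustration. The assumption is that each map task keeps a small bounded cache of running sums, emits a partial sum whenever an entry is evicted, and leaves the final merge to the reduce side.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartialSumSketch {

    // Hypothetical map-side step: sum values per key in a bounded cache and
    // emit a partial sum whenever the cache grows past the threshold.
    static List<Map.Entry<String, Long>> mapSide(List<Map.Entry<String, Long>> input, int threshold) {
        List<Map.Entry<String, Long>> emitted = new ArrayList<>();
        // accessOrder = true keeps the least recently used key first
        LinkedHashMap<String, Long> cache = new LinkedHashMap<>(16, 0.75f, true);
        for (Map.Entry<String, Long> e : input) {
            cache.merge(e.getKey(), e.getValue(), Long::sum);
            if (cache.size() > threshold) {
                // Evict the first entry in iteration order and emit its partial sum
                Iterator<Map.Entry<String, Long>> it = cache.entrySet().iterator();
                Map.Entry<String, Long> eldest = it.next();
                emitted.add(new AbstractMap.SimpleImmutableEntry<>(eldest.getKey(), eldest.getValue()));
                it.remove();
            }
        }
        // Flush whatever is still cached when the input is exhausted
        for (Map.Entry<String, Long> e : cache.entrySet()) {
            emitted.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        return emitted;
    }

    // Hypothetical reduce-side step: merge the partial sums into final totals.
    static Map<String, Long> reduceSide(List<Map.Entry<String, Long>> partials) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> p : partials) {
            totals.merge(p.getKey(), p.getValue(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> input = new ArrayList<>();
        input.add(new AbstractMap.SimpleImmutableEntry<>("a", 1L));
        input.add(new AbstractMap.SimpleImmutableEntry<>("b", 2L));
        input.add(new AbstractMap.SimpleImmutableEntry<>("a", 2L));
        List<Map.Entry<String, Long>> partials = mapSide(input, 2);
        System.out.println(reduceSide(partials));
    }
}
```

The final totals do not depend on which entry is evicted; a smaller threshold only means more partial records cross the shuffle. In this sketch the map side never emits more records than it reads, which is the saving such an optimization is after.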
We are also happy to announce that Cascading and its extensions have their own Maven/Ivy Jar repository, Conjars. Conjars is a public repository: any developer wishing to publish Cascading libraries and extensions can register their public key and push artifacts. Conjars is a simple fork of the Clojars repo code.
Along with this release are a number of extensions created by the Cascading user community.
Among these extensions are:
- Cascading.Avro – Cascading Scheme for the Apache Avro data serialization format.
- Cascading.Memcached – Integration with Memcached, Membase, and ElasticSearch.
- Bixo – a web mining toolkit
- DBMigrate – a tool for migrating data to/from RDBMSs into Hadoop
- Apache HBase, Amazon SimpleDB, and JDBC integration
- JRuby- and Clojure-based scripting languages for Cascading
- Cascalog – a robust interactive extensible query language
This release runs against Hadoop 0.19.x and 0.20.x, including Amazon Elastic MapReduce.