Cascading @ ACM Data Mining, SF

A series of ACM Data Mining events is coming up in the Bay Area…

ACM Data Mining Hackathon 9:00 am Sat, 2012-08-18 at Cloud Center in Sunnyvale. Kaggle programming challenge, BestBuy mobile data, $2000+ prizes
http://www.sfbayacm.org/DM-Hackathon-2012-10

ACM Data Mining Camp (free unconference) 9:00 am Sat, 2012-10-13 at eBay
http://www.sfbayacm.org/DMcamp2012

ACM Big Data Professional Development Seminar 8:30 am Sun, 2012-10-14 at eBay
http://www.sfbayacm.org/event/big-data-professional-development-seminar

We will be teaching a course on MapReduce programming. If you know Linux, Bash, and Python, we’ll show how to run large-scale Cascading apps from the command line.

Media Coverage from Cascading 2.0 Launch

Read what the media is saying about Concurrent and our recent release of Cascading.

CIO
6/6/12 – Ease Big Data Hiring Pain With Cascading
6/8/12 – 9 Open Source Big Data Technologies to Watch

Datanami
7/13/12 – Cascading Into Hadoop’s Golden Era

IT World
6/5/12 – Cascading 2.0 works to ease MapReduce pain

Cloud Times
6/25/12 – 9 Open Source Big Data Technologies Set to Change the Web

Cloud Computing Today
6/10/12 – Cascading 2.0 Streamlines Hadoop-based Big Data Analysis And Development

InfoWorld
6/19/12 – Hadoop becomes critical cog in the big data machine

451 Group
6/4/12 – Concurrent updates Cascading application framework for Hadoop

SiliconAngle
6/4/12 – Cascading 2.0: An Application Framework for Hadoop Winning the Attention of Twitter, Etsy and EMC
6/5/12 – Automation and Easier Aggregation in Hadoop Clusters Signals Data as a Service Trend

ZDNet
6/11/12 – Concurrent launches open source API to ease Hadoop development

Dr. Dobb’s
6/8/12 – Where Does Big Data Go To Get Data-Intensive?
6/25/12 – Open Season On Hadoop Big Data APIs

Information Management
6/7/12 – New Product News – June 7, 2012

MyNoSQL
6/6/12 – Cascading 2.0 Released

Concurrent Launches Cascading 2.0

FOR IMMEDIATE RELEASE

Concurrent Simplifies Big Data Application Development and Management on Hadoop 

Introduces Cascading 2.0, the leading Java application framework for building enterprise Big Data applications on Hadoop

SAN FRANCISCO – June 5, 2012 – Concurrent, Inc., the enterprise Big Data application platform company, introduces Cascading 2.0, the application framework designed to enable Java developers to quickly and easily build Big Data applications on Apache Hadoop. An alternative API to MapReduce, Cascading has been proven in mission-critical applications, and has the support of a growing ecosystem of developers, partners and customers around the world.

Introducing Concurrent, Inc.

Concurrent was founded in 2008 to address the challenges surrounding the development, deployment and management of Big Data applications. Concurrent CEO and Founder Chris Wensel is the author of the Cascading open source project for data processing. Previously, he co-founded Scale Unlimited, the first Hadoop and Big Data-related professional services and training company, where he mentored large companies including Sun Microsystems, Apple and several others in Silicon Valley.

Concurrent is the company behind Cascading, the leading Java development framework for building enterprise Big Data applications on Apache Hadoop.

Introducing Cascading 2.0

Cascading 2.0 is an application framework that enables Java developers to quickly and easily build robust data processing and data management applications on Apache Hadoop that can be deployed on clusters running in the cloud or within private data centers. Available under the Apache 2.0 License Agreement, Cascading offers an alternate API to MapReduce to simplify Big Data application development and deployment.

Today, mission-critical applications already depend on Cascading. Recognized companies like Airbnb, Etsy, FlightCaster, iCrossing, Razorfish, Trulia, TeleNav and Twitter are just some examples where Cascading is being used to streamline data processing, data filtering and workflow optimization for large volumes of unstructured and semi-structured data. With a quickly growing community around it, Cascading is also at the core of popular language extensions including PyCascading, Scalding and Cascalog (open source projects sponsored by Twitter) and tools including CloudFront LogAnalyzer (developed by Amazon).

The Cascading framework is designed for data scientists, Hadoop administrators and application developers alike, to collaborate and rapidly develop and deploy scalable Big Data applications. Using the Cascading 2.0 API:

  • Data scientists can easily discover, model and analyze both unstructured and semi-structured data in any format and from any source such as flat files, key value stores and NoSQL and relational databases.
  • Hadoop administrators can seamlessly move and scale application deployments from development to test and production clusters regardless of cluster location or data size.
  • Application developers can more quickly build and test applications on their desktops in the language of choice (Java, Jython, Scala, Clojure or JRuby) with familiar constructs and reusable components, and instantly deploy them onto clusters of hundreds of nodes.

Supporting Quotes

“Building applications on Hadoop, despite its growing adoption in the enterprise, is notoriously difficult. We are driving the future of application development and management on Hadoop, by allowing enterprises to quickly extract meaningful information from large amounts of distributed data and better understand the business implications. We make it easy for developers to build powerful data processing applications for Hadoop, without requiring months spent learning about the intricacies of MapReduce.”
-Chris Wensel, CEO and Founder, Concurrent, Inc.

“Cascading has proven to streamline complex development on Hadoop. We support the future of Big Data analytics, and technologies like Cascading that help drive more data-driven, predictive enterprises. We already distribute Cascading as part of our Greenplum MR distribution, and plan to increase our integration and support with other offerings in the future.”
-Mike Maxey, Senior Director of Product Marketing of Greenplum, a division of EMC

“MapR shares a commitment to the growing, innovative and rich Hadoop development community. Cascading is already integrated and distributed as part of our MapR Distribution, and is widely used across organizations that depend on Big Data analysis. Cascading lets enterprise developers focus on the business of applications and data processing, while handling the complexities of development.”
-John Schroeder, CEO and Co-Founder, MapR Technologies

“Microsoft is committed to compatibility with Apache Hadoop for our upcoming Hadoop-based services on Windows Server and in the Windows Azure cloud. In testing, Cascading on Windows Server worked directly out of the box and we are certifying Cascading 2.0 on Windows Server to give Microsoft customers a flexible Big Data application development framework for Hadoop that lets them build and deploy applications for Apache Hadoop on Windows Server and Windows Azure.”
-Bob Baker, Director and Partner, Channel Marketing, Microsoft

“Cetas is pleased to partner with Concurrent to facilitate the complex workflows typically performed in Hadoop environments for in-depth analytics.”
-Muddu Sudhakar, Vice President, Cloud and Big Data Analytics, VMware/Cetas

Availability and Pricing

Cascading 2.0 is available now, and freely licensable under the Apache 2.0 License Agreement. Concurrent offers standard and premium support subscriptions for enterprise use, with pricing based on number of users. To learn more about Concurrent’s offerings, please visit http://www.concurrentinc.com/newsletter.

About Concurrent, Inc.

Concurrent, Inc. is the enterprise Big Data application platform company. Founded in 2008 by Chris Wensel, the author of the popular open source Cascading API, Concurrent simplifies Big Data application development, deployment and management on Apache Hadoop. Concurrent is based in San Francisco and funded by Rembrandt Venture Partners and True Ventures. Visit Concurrent online at http://www.concurrentinc.com.

Media Contact
Kelly Indrieri
Kulesa Faul for Concurrent, Inc.
+1 (650) 340 1983
concurrent@kulesafaul.com

New Case Study Posted – Twitter

Twitter has invested heavily in making Cascading a key component of their data analytics infrastructure. Cascading enables Twitter engineers to create complex data processing workflows in their favorite programming languages while providing the scalability to seamlessly handle terabytes and petabytes of data. See the full case study here.

New Case Study Posted – Airbnb

Airbnb is a trusted community marketplace for people to list, discover, and book unique accommodations around the world online or from a mobile phone. They chose Cascading because it provides developers more control when conducting advanced data analysis workflows, data normalization and cleansing. See the full case study here.