Microsoft Monday released an update of its Windows Azure cloud operating system with numerous new features, including a preview of an Apache Hadoop-based distribution for the platform that will add Big Data capabilities.
“To help bolster Big Data capabilities on Windows Azure, we are releasing a preview of the Hadoop-based distribution on Windows Azure,” said Bob Kelly, corporate vice president of Windows Azure marketing at Microsoft (NASDAQ:MSFT). “This preview enables you to easily run Hadoop projects on Windows Azure and achieve some unique benefits such as ease of use and advanced data analysis.”
Kelly said the preview provides a new set of installers that simplify Hadoop setup and deployment, allowing customers to install and set up Hadoop on Windows in hours rather than days. Additionally, he said Microsoft has added new JavaScript libraries that will help JavaScript developers build MapReduce jobs for the platform, while the new Hive ODBC Driver and Hive Add-in for Excel enable analysis of unstructured data using Excel and PowerPivot.
Apache Hadoop is a cross-platform software framework for supporting data-intensive distributed applications that work with thousands of nodes and petabytes of data. Created by Doug Cutting (then with Yahoo!, now with Cloudera), Hadoop builds on Google’s MapReduce, a programming model for processing highly distributable problems across huge datasets using clusters or grids. Hadoop, now an open source Apache Software Foundation project, makes it possible to perform that work on large clusters of commodity hardware.
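To make the MapReduce idea concrete, here is a minimal single-machine sketch in Python of the classic word-count example. It is an illustration only, not Hadoop's API or Microsoft's distribution: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would spread the map and reduce steps across many nodes rather than run them in one process.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = []
    for document in documents:
        # On a cluster, each document (or split) would be mapped on a separate node.
        pairs.extend(map_phase(document))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data on Azure", "big clusters"]))
```

Because each map call looks at only one document and each reduce call at only one key, both steps can in principle run in parallel, which is the property that lets the model scale out over commodity hardware.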
The list of Hadoop adopters reads like a who’s who of Web-scale companies, including Facebook, Amazon, eBay and Twitter, and Hadoop is increasingly seen as an integral component of Big Data analytics: efforts to leverage massive datasets that exist as both structured and unstructured data.
Kelly said Microsoft is releasing the preview of the Hadoop-based distribution on Azure to a limited number of customers beginning this week. He said customers interested in trying it should fill out a Web form with details of the Big Data scenario, and Microsoft will issue access based on those usage scenarios.
The software giant also released an Azure SDK for Node.js. The download includes Node.js libraries for Windows Azure blob, table and queue storage. It also includes Windows Azure PowerShell for Node.js.
Microsoft also released a number of additional tools to help its platform-as-a-service (PaaS) developers work with open source software technologies. It has overhauled the Windows Azure Plugin for Eclipse with Java to add support for sticky sessions, pre-made startup scripts for popular Java servers, remote Java debugging, and more. It has added Azure integration with MongoDB, as well as a deployment package, documentation and code samples.
SQL Azure Federation provides built-in support for elastic scale-out of the data tier, and the company released a new SQL Database Federations specification under the Microsoft Open Specification Promise. It has added a set of code tools and configuration guidelines to help developers get the most out of running Solr on Azure. And it has provided guidance on how to deploy, run and tune memcached on Azure from non-.NET languages.
Thor Olavsrud is a contributor to InternetNews.com, the news service of Internet.com, the network for technology professionals.