Spring Data for Apache Hadoop


Spring for Apache Hadoop simplifies developing for Apache Hadoop by providing a unified configuration model and easy-to-use APIs for HDFS, MapReduce, Pig, and Hive. It also integrates with other Spring ecosystem projects, such as Spring Integration and Spring Batch, enabling you to develop solutions for big data ingest/export and Hadoop workflow orchestration.

Check out the new book from O'Reilly Media, Spring Data: Modern Data Access for Enterprise Java, which contains several chapters on using Spring for Apache Hadoop. Sample code for the book is also available on GitHub.


Features

  • Support for creating Hadoop applications that are configured using dependency injection and run as standard Java applications rather than through Hadoop command-line utilities.
  • Create and configure applications that use Java MapReduce, Streaming, Hive, Pig, Cascading, or HBase.
  • Extensions to Spring Batch that support creating Hadoop-based workflows for any type of Hadoop job or HDFS operation.
  • Script HDFS operations using any JVM-based scripting language.
  • DAO support (Template & Callbacks) for HBase.
  • Cascading Taps for Spring & Spring Integration.
  • Support for Hadoop Security.
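
To illustrate the first feature, here is a minimal sketch of a Spring application context that declares a Hadoop configuration and a MapReduce job, so that the job runs when an ordinary Java application loads the context. It assumes the `hdp` XML namespace described in the reference documentation; the HDFS address, the paths, the bean ids, and the `com.example` mapper and reducer classes are hypothetical placeholders, and the element and attribute names should be checked against the reference documentation for the version you use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

  <!-- Hadoop configuration; the HDFS address is an assumed local cluster -->
  <hdp:configuration>
    fs.default.name=hdfs://localhost:9000
  </hdp:configuration>

  <!-- MapReduce job definition; mapper and reducer are hypothetical classes -->
  <hdp:job id="wordCountJob"
           input-path="/input/" output-path="/output/"
           mapper="com.example.WordCountMapper"
           reducer="com.example.WordCountReducer"/>

  <!-- Runs the job when the application context starts -->
  <hdp:job-runner id="runner" job-ref="wordCountJob" run-at-startup="true"/>
</beans>
```

Loading this file with a standard Spring `ClassPathXmlApplicationContext` from a `main` method would be enough to submit the job, which is what running "as a standard Java application" means here.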

Latest News

  • Spring for Apache Hadoop 1.0.0 goes GA
  • Spring for Apache Hadoop 1.0.0.RC2 released
  • "Introducing Spring for Apache Hadoop" webinar announced for November 8th
  • Spring for Apache Hadoop featured in GigaOM's "A programmer's guide to big data: 12 tools to know"
  • Spring for Apache Hadoop 1.0.0.RC1 released
  • Project Serengeti announced
  • Spring for Apache Hadoop 1.0.0.M2 released
  • Spring for Apache Hadoop Talk at Strata 2012 - Download Presentation
  • Spring for Apache Hadoop 1.0.0.M1 released
  • Introducing Spring for Apache Hadoop

    Resources

    Reference Documentation: HTML, PDF
    Javadocs: HTML
    Issue Tracking: JIRA
    Source Control: GitHub
    Source Repository Browser: Fisheye
    Build Status: CI
    Forum: Forum

    Development snapshot

    Reference Documentation: HTML
    Javadocs: HTML

     

    Maven Artifacts

    Maven Release Repository
    <repository>
      <!-- Release -->
      <id>spring-release</id>
      <name>Spring Maven Release Repository</name>
      <url>http://repo.springframework.org/release</url>
    </repository>
    Maven Release Dependency
    <dependency>
      <groupId>org.springframework.data</groupId>
      <artifactId>spring-data-hadoop</artifactId>
      <version>1.0.0.RELEASE</version>
    </dependency>
    Maven Milestone Repository
    <repository>
      <!-- Milestone/RC -->
      <id>spring-milestone</id>
      <name>Spring Maven Milestone Repository</name>
      <url>http://repo.springframework.org/milestone</url>
    </repository>
    Maven Milestone Dependency
    <dependency>
      <groupId>org.springframework.data</groupId>
      <artifactId>spring-data-hadoop</artifactId>
      <version>1.0.0.RC2</version>
    </dependency>
    Maven Snapshot Repository
    <repository>
      <!-- Snapshots -->
      <id>spring-snapshot</id>
      <name>Spring Maven SNAPSHOT Repository</name>
      <url>http://repo.springframework.org/snapshot</url>
    </repository>
    Maven Snapshot Dependency
    <dependency>
      <groupId>org.springframework.data</groupId>
      <artifactId>spring-data-hadoop</artifactId>
      <version>1.0.0.BUILD-SNAPSHOT</version>
    </dependency>
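
The repository and dependency fragments belong in different parts of a Maven POM: repositories go under `<repositories>` and dependencies under `<dependencies>`. As a minimal sketch, a complete `pom.xml` using the release artifacts might look like the following; the project's own `groupId`, `artifactId`, and `version` are hypothetical placeholders.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates for your own application -->
  <groupId>com.example</groupId>
  <artifactId>hadoop-sample</artifactId>
  <version>0.1.0</version>

  <repositories>
    <!-- Release repository, as listed on this page -->
    <repository>
      <id>spring-release</id>
      <name>Spring Maven Release Repository</name>
      <url>http://repo.springframework.org/release</url>
    </repository>
  </repositories>

  <dependencies>
    <!-- Latest GA release of Spring for Apache Hadoop -->
    <dependency>
      <groupId>org.springframework.data</groupId>
      <artifactId>spring-data-hadoop</artifactId>
      <version>1.0.0.RELEASE</version>
    </dependency>
  </dependencies>
</project>
```

To use a milestone or snapshot build instead, swap in the matching repository and version from the fragments listed on this page.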

     

    Latest GA release - 1.0.0.RELEASE


  • Spring Data

    Spring Data makes it easier to build Spring-powered applications that use new data access technologies such as non-relational databases, map-reduce frameworks, and cloud-based data services. It also provides improved support for relational database technologies.

    Spring Data is an umbrella open-source project that contains many subprojects, each specific to a given database. The projects are developed in collaboration with many of the companies and developers behind these technologies.

    Now available from O'Reilly Media is the book:

    Spring Data: Modern Data Access for Enterprise Java.


    Spring Data Projects:

    Category | Sub-project | Description
    Relational Databases | JPA | Spring Data JPA - Simplifies the development of a JPA-based data access layer.
    Relational Databases | JDBC Extensions | Support for Oracle RAC, Advanced Queuing, and Advanced datatypes. Support for using QueryDSL with JdbcTemplate.
    Big Data | Apache Hadoop | The Apache Hadoop project is an open-source implementation of frameworks for reliable, scalable, distributed computing and data storage.
    Data-Grid | GemFire | VMware vFabric GemFire is a distributed data management platform providing dynamic scalability, high performance, and database-like persistence. It blends advanced techniques like replication, partitioning, data-aware routing, and continuous querying.
    HTTP | REST | Spring Data REST - Perform CRUD operations on your persistence model using HTTP and Spring Data Repositories.
    Key Value Stores | Redis | Redis is an open source, advanced key-value store.
    Document Stores | MongoDB | MongoDB is a scalable, high-performance, open source, document-oriented database.
    Graph Databases | Neo4j | Neo4j is a graph database, a fully transactional database that stores data structured as graphs.
    Column Stores | HBase | Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable. HBase functionality is part of the Spring for Apache Hadoop project.
    Common Infrastructure | Commons | Provides shared infrastructure for use across various data access projects. General support for cross-database persistence is located here.

    Community-driven projects:

    Sub-project | Description
    Spring Data Solr | Spring Data repositories abstraction for Apache Solr
    Spring Data Elasticsearch | Spring Data repositories abstraction for Elasticsearch
    Spring Data Couchbase | Spring Data repositories abstraction for Couchbase
    Spring Data FuzzyDB | Spring Data repositories abstraction for FuzzyDB

    Attic

    The Attic contains links to Spring Data projects that are no longer maintained.

    Participation

    Please reach out on the forums or JIRA with specific questions and requests, or to express interest in participating in development on GitHub at http://github.com/SpringSource.

    Twitter

    Follow SpringData on Twitter: SpringData

    Follow the team members on Twitter