Hadoop Distributed File System (HDFS)

Developed as the primary storage system of the Hadoop framework, HDFS handles large files of application data by splitting them into blocks, which are replicated and stored on different nodes of the cluster, thus achieving fault tolerance and high availability. HDFS is designed to run on commodity hardware, with emphasis on batch processing and high data-access throughput rather than low latency. It follows a write-once-read-many access model for files, which simplifies data coherency issues. HDFS uses a master/slave architecture: a single master node, the NameNode, manages the file system namespace and regulates client access to files, while the slave nodes, the DataNodes, store the actual data blocks. Although this design is simple and effective, the NameNode constitutes a single point of failure in an HDFS cluster. HDFS is widely used in production (Yahoo!, Facebook, Last.fm, etc. [1]).
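To make the access model concrete, here is a minimal sketch of client-side HDFS usage through Hadoop's Java FileSystem API. The NameNode URI (hdfs://namenode:9000), the file path, the replication factor, and the block size are illustrative assumptions, not values taken from this page.

<verbatim>
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed NameNode address; substitute your cluster's URI.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/user/demo/sample.txt");

        // Write once: the file is split into blocks and each block is
        // replicated across DataNodes. 3 replicas and a 128 MB block size
        // are assumed here (the classic HDFS default block size was 64 MB).
        FSDataOutputStream out =
                fs.create(file, true, 4096, (short) 3, 128L * 1024 * 1024);
        out.writeUTF("hello hdfs");
        out.close();

        // Read many: any client can stream the file back; the NameNode
        // resolves which DataNodes hold each block.
        FSDataInputStream in = fs.open(file);
        System.out.println(in.readUTF());
        in.close();

        // Inspect where the replicated blocks actually landed.
        FileStatus status = fs.getFileStatus(file);
        for (BlockLocation loc :
                fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println(loc);
        }
        fs.close();
    }
}
</verbatim>

Because a file is immutable once closed, concurrent readers never observe partial writes, which is the coherency simplification noted above.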

Ceph

References

  1. Applications and organizations using Hadoop

-- IoannisKonstantinou - 10 Jun 2010
-- ChristinaBoumpouka - 16 Jun 2010
