Skip to main content

The Google File System - Google's core storage platform

Google File System - large distributed log structured file system in which they throw in a lot of data. Reliable scalable storage is a core need of any application. GFS is Google's core storage platform.Google File System (GFS) is a proprietary distributed file system developed by Google for its own use. Its point is both to assure reliablity by using redundant copies and to allow individual most used data to selectively receive more resources (more dedicated hardware or/and redundant copies).GFS is optimized for Google's core data storage needs, web searching, which can generate enormous amounts of data that needs to be retained; Google File System grew out of an earlier Google effort, "BigFiles", developed by Larry Page and Sergey Brin in the early days of Google, while it was still located in Stanford. The data is stored persistently, in very large, multiple gigabyte-sized files (around 100GB) which are only extremely rarely deleted, overwritten, or shrunk; files are usually appended to or read. It is also designed and optimized to run on Google's computing clusters, the nodes of which consist of cheap, "commodity" computers, which means precautions must be taken against the high failure rate of individual nodes and the subsequent data loss. Other design decisions select for high data throughputs, even when it comes at the cost of latency.

read more digg story

Comments

Popular posts from this blog

Comma separated list of values of single Database table field

Many times you need to create a comma seperated list of values in a table. Here is a line of T-SQL solution to get comma separated list of values of single field of a database table. DECLARE @commaSeparatedVal AS VARCHAR(MAX); SELECT @commaSeparatedVal = ISNULL(@commaSeparatedVal +',','') + CONVERT(VARCHAR,[SKU]) FROM PRODUCT PRINT @commaSeparatedVal

Why SharePoint 2007?

It is rare for a technology product to attract as much attention as SharePoint has in recent years. The industry has historically paid little attention to new product suites, particularly those related to web design. SharePoint products and technologies, however, have managed to excite and rejuvenate industry followers, causing them to take notice of the ease of use, scalability, flexibility, and powerful document management capabilities within the product. A number of organizational needs have spurred the adoption of SharePoint technologies. Some of the most commonly mentioned requirements include the following: A need for better document management than the file system can offer —This includes document versioning, check-out and check-in features, adding metadata to documents, and better control of document access (by using groups and granular security). The high-level need is simply to make it easier for users to find the latest version of the document or documents they need to do th

Creating Custom SharePoint Timer Jobs

In previous versions of SharePoint (or other platforms), if you had some task you wanted to perform on a scheduled basis, you'd have to either create a console EXE and schedule it to run via Windows Task Scheduler (ala AT.EXE) or create a Windows Service that went to sleep for a period of time. In order to install (and maintain) these tasks, you had to have console access to your production SharePoint (or other app) servers... something IT or admins wouldn't easily hand out. Addressing this issue, Microsoft has added something called timer jobs to Microsoft Office SharePoint Server (MOSS) 2007. Microsoft uses timer jobs to do things like dead web cleanup (purging unused sites from site collections) among others. To see what other timer jobs are out there, from Central Administration , click Operations and then Timer Job Definitions . Not only does Microsoft use timer jobs in MOSS, but you can create your own custom timer jobs to do your own scheduled tasks. What's nice a