
Big things come with big changes, and so does Big Data.

While learning SQL is one of the ABCs for any student studying databases and choosing a career in this field, it would be good for HANA developers to get to know NoSQL as well.

Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open-source relational database that did not expose the standard SQL interface. NoSQL databases are finding significant and growing industry use in big data and real-time web applications. One such NoSQL database is Project Voldemort.

Voldemort is a distributed data store that is designed as a key-value store used by LinkedIn for high-scalability storage. It is named after the fictional Harry Potter villain Lord Voldemort.

Voldemort is still under development. It is neither an object database, nor a relational database. It does not try to satisfy arbitrary relations and the ACID properties, but rather is a big, distributed, fault-tolerant, persistent hash table.

Voldemort offers a number of advantages over other databases:

-It combines in-memory caching with the storage system so that a separate caching tier is not required (instead the storage system itself is just fast)

-It is possible to emulate the storage layer, as it is completely mockable. This makes the development and the unit testing easy, as it can be done against a throw-away in-memory storage system without the need for a real cluster or real storage system

-Reads and writes scale horizontally

-Simple API: The API decides data replication and placement and accommodates a wide range of application-specific strategies

-Transparent data partitioning: This allows for cluster expansion without rebalancing all data
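To make the "mockable storage layer" point concrete, here is a minimal sketch in Python of what testing against a throw-away in-memory store could look like. It is not Voldemort's client API: the names `InMemoryStore` and `record_login` are hypothetical, and the only assumption is that application code talks to the store through simple get, put and delete calls.

```python
from typing import Dict, Optional


class InMemoryStore:
    """Throw-away key-value store backed by a plain dict (hypothetical name)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: str) -> bool:
        # True if the key existed and was removed.
        return self._data.pop(key, None) is not None


def record_login(store, user_id: str) -> int:
    """Example application code: counts logins using only get and put."""
    raw = store.get(user_id)
    count = int(raw.decode()) + 1 if raw is not None else 1
    store.put(user_id, str(count).encode())
    return count
```

Because `record_login` only depends on the get and put calls, a unit test can pass in `InMemoryStore()` and never needs a real cluster; a client for a real store with the same three methods could be swapped in later.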

The Voldemort distributed data store has the following properties:

-Data placement: Support for pluggable data placement strategies exists to support things like distribution across data centers that are far apart.

-Data replication: The data is automatically replicated over a large number of servers.

-Data partitioning: The data is automatically partitioned so that each server contains only a subset of the total data

-Good single node performance: 10–20k operations per second can occur depending on the machines, the network, the disk system, and data replication factor

-Node independence: Each node is independent of other nodes with no central point of failure or coordination

-Pluggable serialization: This allows rich keys and values, including lists and tuples with named fields, as well as integration with common serialization frameworks such as Avro, Java Serialization, Protocol Buffers, and Thrift

-Transparent failures: Server failures are handled transparently so that the user doesn’t see such problems

-Versioning: The data items are versioned to maximize data integrity in case of failure without compromising availability of the system.
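The partitioning and replication properties listed above can be illustrated with a small consistent-hashing sketch in Python. This shows the general technique only and is not Voldemort's actual implementation: the class name `HashRing`, the use of MD5, the node names, and the parameters `replicas` and `vnodes` are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_right
from typing import List


def _hash(value: str) -> int:
    # MD5 is used here only as a stable, well-spread hash (an assumption).
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    """Maps each key to `replicas` distinct nodes on a hash ring (hypothetical)."""

    def __init__(self, nodes: List[str], replicas: int = 2, vnodes: int = 64) -> None:
        self.replicas = replicas
        # Each node is placed on the ring at `vnodes` points to spread load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def preference_list(self, key: str) -> List[str]:
        """Walk clockwise from the key's position, collecting distinct nodes."""
        start = bisect_right(self._points, _hash(key))
        owners: List[str] = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == self.replicas:
                break
        return owners
```

For example, `HashRing(["node-a", "node-b", "node-c"]).preference_list("user:42")` returns two distinct nodes: the first holds the primary copy and the second a replica. If a fourth node is added to the ring, the only keys whose primary changes are the ones that move to the new node, which is why a cluster can grow without rebalancing all data.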

Let the world change unconventionally… with HANA
