2015-02-09 - NoSQL Lecture Mosbach
› No relational data model (no SQL)
› Distributed, horizontally scalable
› Schema-free or only weak schema restrictions
› A different consistency model
Requirements for a distributed system
› Consistency
› Availability
› Partition tolerance
Consistency
› The system is in a consistent state after an operation
› All clients see the same data
› Strong consistency (ACID) vs. eventual consistency (BASE)
› ACID: Atomicity, Consistency, Isolation, Durability
› BASE: Basically Available, Soft state, Eventually consistent
Availability
› System is “always on”, no downtime
› Node failure tolerance: all clients can find some available replica
› Software/hardware upgrade tolerance
Partition tolerance
› System continues to function even when split into disconnected subsets (network disruption)
› Not only for reads, but for writes as well
CAP Theorem: CA
› Single-site clusters (easier to ensure all nodes are always in contact)
› When a partition occurs, the system blocks
› E.g. usable for two-phase commit (2PC), which already requires blocking
Obviously, any horizontal scaling strategy is based on data partitioning; therefore, we are forced to decide between consistency and availability.
CAP Theorem: CP
› Some data may be inaccessible (availability sacrificed), but the rest is still consistent/accurate
› E.g. sharded database
CAP Theorem: AP
› System is still available under partitioning, but some of the data returned may be inaccurate
› Needs some conflict resolution strategy
› E.g. master/slave replication
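The AP case calls for "some conflict resolution strategy". One simple and common choice is last-write-wins. The following is a minimal sketch in plain JavaScript, assuming each replica stores every value together with a timestamp; the function `mergeLastWriteWins` and both replica objects are hypothetical and not taken from any particular database.

```javascript
// Hypothetical sketch: merge two replicas that diverged during a partition.
// Each replica maps key -> { value, timestamp }; the newer timestamp wins.
function mergeLastWriteWins(replicaA, replicaB) {
  const merged = {};
  const keys = new Set([...Object.keys(replicaA), ...Object.keys(replicaB)]);
  for (const key of keys) {
    const a = replicaA[key];
    const b = replicaB[key];
    if (a === undefined) merged[key] = b;
    else if (b === undefined) merged[key] = a;
    else merged[key] = b.timestamp > a.timestamp ? b : a;
  }
  return merged;
}

const replicaA = { "note1:title": { value: "Mittag", timestamp: 1 } };
const replicaB = {
  "note1:title": { value: "Abend", timestamp: 2 },
  "note1:message": { value: "nicht vergessen", timestamp: 1 },
};
const merged = mergeLastWriteWins(replicaA, replicaB);
console.log(merged["note1:title"].value);
```

Last-write-wins is easy to implement but silently discards the older write, which is exactly the kind of inaccuracy the AP trade-off accepts.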
Classification
› Key-value stores: Redis
› Document stores: MongoDB & RavenDB
› Wide-column stores
› Graph databases
› and many more
GET & SET
In the shell
› SET note1:title "Mittag"
› SET note1:message "nicht vergessen"
› KEYS note1:*
› GET note1:title
› DEL note1:title note1:message
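The commands above treat `note1:title` and `note1:message` as independent keys; the `note1:` prefix is only a naming convention. A minimal sketch of that idea, using a plain JavaScript `Map` as a stand-in for the store. This is not a Redis client, the helpers `set`, `get`, `keys` and `del` are hypothetical, and only the trailing-`*` form of KEYS is mimicked.

```javascript
// Hypothetical in-memory stand-in for a key-value store (not a Redis client).
const store = new Map();
const set = (key, value) => { store.set(key, value); return "OK"; };
const get = (key) => (store.has(key) ? store.get(key) : null);

// Only a trailing * is supported here, implemented as a prefix match.
const keys = (pattern) => {
  const isPrefix = pattern.endsWith("*");
  const prefix = isPrefix ? pattern.slice(0, -1) : pattern;
  return [...store.keys()].filter((k) => (isPrefix ? k.startsWith(prefix) : k === pattern));
};

// Returns how many of the given keys were actually removed.
const del = (...ks) => ks.filter((k) => store.delete(k)).length;

set("note1:title", "Mittag");
set("note1:message", "nicht vergessen");
const found = keys("note1:*");
const title = get("note1:title");
const removed = del("note1:title", "note1:message");
console.log(found, title, removed);
```

Because the store knows nothing about "notes", grouping related values is entirely up to the key names the application chooses.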
RavenDB
› Written by Oren Eini aka Ayende Rahien
› Hibernating Rhinos
› Rhino Mocks & Rhino.ServiceBus
› Written in C#
Deployment
› Get it via NuGet
› Change defaults in Raven.Server.exe.config
› It's safe by default
› Just run Raven.Server.exe in the /server/ folder
Safe by default
› Useful defaults
› E.g. limited page size: no accidental SELECT *
› ACID (transactional) *
Makes developers happy
› Testable
› Interfaces all over
› In-memory database
› Extensible: plugin support
In-memory instance
Embedded mode
using (var documentStore = new EmbeddableDocumentStore { RunInMemory = true }.Initialize())
{
    using (var session = documentStore.OpenSession())
    {
        // Run complex test scenarios
    }
}
APIs
› Native .NET client API
› HTTP API (pseudo-REST)
Indexes
› Written as LINQ queries
› Indexed with Lucene.NET
› Lucene syntax for querying
“While being RESTful is a goal of the HTTP API, it is secondary to the goal of exposing easy to use and powerful functionality”
Ayende Rahien on the HTTP API - http://ravendb.net/documentation/docs-http-api-restful
HTTP API
› Caching
› ETags
› Lucene queries possible
C:\>curl -X GET http://localhost:8080/docs/Categories/1 -i
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
ETag: 00000000-0000-0200-0000-000000000004

{
  "Name" : "Normal Importance",
  "Color" : "green"
}
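The ETag in the response is what makes HTTP caching possible: a client can send the tag back and skip the download when nothing has changed. The sketch below mimics standard HTTP conditional-GET semantics (`If-None-Match` answered with `304 Not Modified`) in plain JavaScript. It is an in-memory illustration only; `handleGet` and the `documents` object are hypothetical, and how a given RavenDB version answers such requests is an assumption to check against its documentation.

```javascript
// Hypothetical in-memory stand-in for a document endpoint; not the RavenDB server.
const documents = {
  "Categories/1": {
    etag: "00000000-0000-0200-0000-000000000004",
    body: { Name: "Normal Importance", Color: "green" },
  },
};

// Mimics a conditional GET: an If-None-Match header equal to the current ETag
// yields 304 without a body, otherwise 200 with the document and its ETag.
function handleGet(id, headers = {}) {
  const doc = documents[id];
  if (!doc) return { status: 404 };
  if (headers["If-None-Match"] === doc.etag) return { status: 304, etag: doc.etag };
  return { status: 200, etag: doc.etag, body: doc.body };
}

const first = handleGet("Categories/1");
const second = handleGet("Categories/1", { "If-None-Match": first.etag });
console.log(first.status, second.status);
```

The second call carries no body, so the client keeps using the copy it cached after the first call.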
Database Timeline
1966    IBM's IMS
1969    CODASYL model published
1970    Codd publishes relational model paper
1973    INGRES
1974    SQL invented
1977    Oracle founded
1985    Term “object-oriented database” appears
1990s   Agile becoming more popular
2000    Brewer's CAP born
2004    Google BigTable
2007    Amazon Dynamo; 10gen founded
2008    Apache Cassandra initial release
2009    MongoDB initial release; NoSQL movement
“Deployment”
› Create the default data directory: c:\data\db
› Start the server: mongod.exe
› Shell: mongo.exe
CRUD – Create
In the shell
› use WebNote
› db.Notes.save( { Title: 'Mittag', Message: 'nicht vergessen' });
How the command works
› db.Notes.save (entered without parentheses, the shell prints the function's source)
CRUD – Create
…with a bit of JavaScript
for (i = 0; i < 1000; i++) {
  ['quiz', 'essay', 'exam'].forEach(function(name) {
    var score = Math.floor(Math.random() * 50) + 50;
    db.scores.save({student: i, name: name, score: score});
  });
}
db.scores.count();
CRUD – Read
Queries are likewise specified in document style
› db.Notes.find();
› db.Notes.find({ Title: /Test/i });
› db.Notes.find( { "Categories.Color": "red"}).limit(1);
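"Document style" means the query itself is a document describing the fields a match must have. A minimal sketch of that idea in plain JavaScript follows. The `matches` function is hypothetical and covers only equality, regular expressions and the `$gt`/`$lt` operators; it is not how MongoDB evaluates queries, and the `Priority` field is invented for the example.

```javascript
// Hypothetical matcher for a small subset of document-style queries:
// plain equality, regular expressions, and the $gt / $lt operators.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field];
    if (condition instanceof RegExp) {
      return typeof value === "string" && condition.test(value);
    }
    if (condition !== null && typeof condition === "object") {
      return Object.entries(condition).every(([op, operand]) => {
        if (op === "$gt") return value > operand;
        if (op === "$lt") return value < operand;
        throw new Error("unsupported operator: " + op);
      });
    }
    return value === condition;
  });
}

const notes = [
  { Title: "Mittag", Message: "nicht vergessen", Priority: 2 },
  { Title: "Test run", Message: "check results", Priority: 5 },
];
const byRegex = notes.filter((n) => matches(n, { Title: /Test/i }));
const byRange = notes.filter((n) => matches(n, { Priority: { $gt: 3 } }));
const all = notes.filter((n) => matches(n, {}));
console.log(byRegex.length, byRange.length, all.length);
```

An empty query document matches everything, which is why `find()` without arguments returns the whole collection.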
CRUD – Update
› db.Notes.update({Title: 'Test'}, {'$set': {Categories: []}});
› db.Notes.update({Title: 'Test'}, {'$push': {Categories: {Color: 'Red'}}});
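The two update operators used here differ in effect: `$set` overwrites or creates a field, while `$push` appends one element to an array. A minimal sketch of those semantics on a single in-memory document; `applyUpdate` is a hypothetical helper, not part of any MongoDB driver, and it models only these two operators.

```javascript
// Hypothetical helper mimicking two update operators on one in-memory document.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // deep copy of plain JSON data
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value; // $set overwrites or creates the field
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (result[field] === undefined) result[field] = []; // missing field becomes a new array
    if (!Array.isArray(result[field])) throw new Error(field + " is not an array");
    result[field].push(value); // $push appends exactly one element
  }
  return result;
}

let note = { Title: "Test" };
note = applyUpdate(note, { $set: { Categories: [] } });
note = applyUpdate(note, { $push: { Categories: { Color: "Red" } } });
console.log(JSON.stringify(note));
```

Note that pushing an array value would append that array as a single nested element rather than merging its items.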
Data Import (hands-on.zip)
cd dump_training
mongorestore -d training -c scores scores.bson
cd dump_digg
mongorestore -d digg -c stories stories.bson
Exercises
1. Find all scores less than 65.
2. Find the lowest quiz score. Find the highest quiz score.
3. Write a query to find all digg stories where the view count is greater than 1000.
4. Query for all digg stories whose media type is either 'news' or 'images' and where the topic name is 'Comedy'.
5. Find all digg stories where the topic name is 'Television' or the media type is 'videos'. Skip the first 5 results, and limit the result set to 10.
CRUD – Update
› use digg;
› db.people.update({name: 'Smith'}, {'$set': {interests: []}});
› db.people.update({name: 'Smith'}, {'$push': {interests: 'chess'}});
Exercises
1. Set the proper 'grade' attribute for all scores. For example, users with scores greater than 90 get an 'A'. Set the grade to 'B' for scores falling between 80 and 90.
2. You're being nice, so you decide to add 10 points to every score on every “final” exam whose score is lower than 60. How do you do this update?
“MapReduce is the Uzi of aggregation tools. Everything described with count, distinct and group can be done with MapReduce, and more.”
Kristina Chodorow, Michael Dirolf in MongoDB – The Definitive Guide
MapReduce
To use map-reduce, you first write a map function.
var map = function() {
  // Emit posts: 1 per story so that reduce only has to add values up
  emit(this.user.name, {diggs: this.diggs, posts: 1});
};
MapReduce
The reduce function then aggregates those docs by key.
var reduce = function(key, values) {
  var diggs = 0;
  var posts = 0;
  values.forEach(function(doc) {
    diggs += doc.diggs;
    // Sum the counts: reduce may be skipped for single-value keys or re-run on its own output
    posts += doc.posts;
  });
  return {diggs: diggs, posts: posts};
};
MapReduce
Now both are used to perform custom aggregation.
db.stories.mapReduce(map, reduce, {out: 'digg_users'});
db.digg_users.find();
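To see what the server does with the two functions, the whole cycle can be imitated in a single process. The sketch below is a plain JavaScript stand-in, not MongoDB's implementation: `mapReduce`, `emit` and the three sample stories are hypothetical. The map function here emits `posts: 1` and reduce sums `doc.posts`, so that a key with only one emitted value (for which reduce is not called) still gets a correct post count.

```javascript
// Hypothetical sample documents; the real digg stories have more fields.
const stories = [
  { user: { name: "alice" }, diggs: 10 },
  { user: { name: "bob" }, diggs: 3 },
  { user: { name: "alice" }, diggs: 5 },
];

// map runs with `this` bound to each document and calls emit(key, value).
const map = function () {
  emit(this.user.name, { diggs: this.diggs, posts: 1 });
};

// reduce folds all values emitted for one key into one value of the same shape.
const reduce = function (key, values) {
  let diggs = 0;
  let posts = 0;
  values.forEach(function (doc) {
    diggs += doc.diggs;
    posts += doc.posts;
  });
  return { diggs: diggs, posts: posts };
};

// Minimal single-process stand-in for db.stories.mapReduce(map, reduce, ...).
let emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}
function mapReduce(docs, mapFn, reduceFn) {
  emitted = new Map();
  docs.forEach((doc) => mapFn.call(doc));
  const out = {};
  for (const [key, values] of emitted) {
    // A key with a single value is passed through without calling reduce.
    out[key] = values.length === 1 ? values[0] : reduceFn(key, values);
  }
  return out;
}

const diggUsers = mapReduce(stories, map, reduce);
console.log(diggUsers);
```

Keeping the emitted value and the reduced value in the same shape is what allows reduce to be applied repeatedly to partial results.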
“MapReduce is slower and is not supposed to be used in ‘real time’. You run MapReduce as a background job.”
Kristina Chodorow, Michael Dirolf in MongoDB – The Definitive Guide
JSON BSON
All JSON documents are stored in a binary format called BSON. BSON supports a richer set of types than JSON.
http://bsonspec.org
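One concrete case of the "richer set of types" is dates: JSON has no date type, while BSON does. The short sketch below uses only standard JavaScript to show what happens to a `Date` in a JSON round trip; the document content is made up for the example.

```javascript
// JSON has no date type: a Date survives a round trip only as a string.
const original = { author: "Johannes", date: new Date("2011-09-18T09:56:06.298Z") };
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.date instanceof Date); // true
console.log(typeof roundTripped.date);      // "string"
console.log(roundTripped.date);             // "2011-09-18T09:56:06.298Z"
```

Storing the value as a real date rather than a string is what makes range queries and sorting on that field behave as expected.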
Terminology
RDBMS           MongoDB
Table           Collection
Row(s)          JSON document
Index           Index
Join            Embedding & linking
Partition       Shard
Partition key   Shard key
Inheritance - Table
id   type     area   radius   length   width
1    circle   3.14   1        NULL     NULL
2    square   4      NULL     2        NULL
3    rect     10     NULL     5        2
Inheritance - Document
> db.shapes.find()
› { _id: "1", type: "c", area: 3.14, radius: 1}
› { _id: "2", type: "s", area: 4, length: 2}
› { _id: "3", type: "r", area: 10, length: 5, width: 2}
// Find shapes with radius > 0
> db.shapes.find( { radius: { $gt: 0 } } )
One to Many
Embedded array
blogs: {
  author: "Johannes",
  date: ISODate("2011-09-18T09:56:06.298Z"),
  comments: [
    {
      author: "Klaus",
      date: ISODate("2011-09-19T09:56:06.298Z"),
      text: "toller Artikel"
    }
  ]
}
One to Many
Normalized (2 collections)
blogs: {
  _id: 1000,
  author: "Johannes",
  date: ISODate("2011-09-18"),
  comments: [ {comment: 1} ]
}
comments: {
  _id: 1,
  blog: 1000,
  author: "Klaus",
  date: ISODate("2011-09-19")
}
> blog = db.blogs.findOne({ text: "Destination Moon" });
> db.comments.find( { blog: blog._id } );
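With two collections, the application performs the join itself in two steps: first load the parent document, then load the children that reference its `_id`. A minimal sketch with plain arrays standing in for the collections; the second comment and the lookup by `author` are made up for the example.

```javascript
// Hypothetical in-memory collections shaped like the two-collection example.
const blogs = [{ _id: 1000, author: "Johannes", comments: [{ comment: 1 }] }];
const comments = [
  { _id: 1, blog: 1000, author: "Klaus" },
  { _id: 2, blog: 2000, author: "Petra" },
];

// Step 1: fetch the single parent document (a single document, not a cursor).
const blog = blogs.find((b) => b.author === "Johannes");

// Step 2: fetch the children that reference it; against a real database
// this is a second round trip issued by the application.
const blogComments = comments.filter((c) => c.blog === blog._id);
console.log(blogComments.map((c) => c.author));
```

The first step has to yield one document, because the second step reads `_id` from it.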
Many - Many
// Each product links to the IDs of its categories
products:
  { _id: 10, name: "Destination Moon", category_ids: [ 20, 30 ] }
// Each category links to the IDs of its products
categories:
  { _id: 20, name: "adventure", product_ids: [ 10, 11, 12 ] }
  { _id: 21, name: "movie", product_ids: [ 10 ] }
// All categories for a product
> db.categories.find( { product_ids: 10 } )
Alternative: Many - Many
// Each product links to the IDs of its categories
products:
  { _id: 10, name: "Destination Moon", category_ids: [ 20, 30 ] }
// Categories contain no associations
categories:
  { _id: 20, name: "adventure" }
// All products for a category
> db.products.find( { category_ids: 20 } )
// All categories for a product
> product = db.products.findOne( { _id: some_id } )
> db.categories.find( { _id: { $in: product.category_ids } } )
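When only the products carry the links, both directions of the many-to-many relation can still be answered. A minimal sketch with plain arrays in place of the collections; the second product, the second category name and the helpers `productsIn` and `categoriesOf` are hypothetical.

```javascript
const products = [
  { _id: 10, name: "Destination Moon", category_ids: [20, 30] },
  { _id: 11, name: "Hypothetical Product", category_ids: [30] },
];
const categories = [
  { _id: 20, name: "adventure" },
  { _id: 30, name: "hypothetical category" },
];

// Like find({ category_ids: 20 }): a scalar matches when the array contains it.
const productsIn = (categoryId) =>
  products.filter((p) => p.category_ids.includes(categoryId));

// Like find({ _id: { $in: product.category_ids } }): two steps, product first.
const categoriesOf = (productId) => {
  const product = products.find((p) => p._id === productId);
  return product ? categories.filter((c) => product.category_ids.includes(c._id)) : [];
};

console.log(productsIn(20).map((p) => p.name));
console.log(categoriesOf(10).map((c) => c.name));
```

The trade-off is that one direction needs two queries, in exchange for having only one place where the association must be kept up to date.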
Usual problems with integration tests
› comfortable managers, business constraints, pragmatic solutions, own laziness…
› Bugs will come back to haunt you!
In-memory instance
Embedded mode
using (var documentStore = new EmbeddableDocumentStore { RunInMemory = true }.Initialize())
{
    using (var session = documentStore.OpenSession())
    {
        // Run complex test scenarios
    }
}
NoSQL: Einstieg in die Welt nicht-relationaler Web 2.0 Datenbanken
MongoDB: The Definitive Guide
MongoDB in Action
RavenDB Mythology Documentation: https://s3.amazonaws.com/daily-builds/RavenDBMythology-11.pdf