
Methods & Tools


Architecture

Microsoft Azure and Couchbase Hands on Lab (Detroit)

NorthScale Blog - Fri, 02/24/2017 - 02:17

Microsoft Azure and Couchbase are presenting a free hands-on lab “Lunch & Learn” on using NoSQL with Docker Containers.

  • When: Wednesday, March 8th, 2017 – 11:00am – 2:00pm
  • Where: Microsoft Technology Center
    1000 Town Center, Suite 250, Room MPR3
    Southfield, MI 48075

Sign up today to reserve your seat.

Event details

On Wednesday, March 8th, Microsoft and Couchbase are holding a joint Lunch & Learn from 11:00 am to 2:00 pm to introduce you to the fundamentals of today’s quickly maturing NoSQL technology. Specifically, we will show you how easy it is to add Couchbase to an Azure cloud or hybrid cloud environment.

Whether you are new to NoSQL technologies or have had experience with Couchbase, we hope you can join this informative session showcasing how the world’s leading companies are utilizing Couchbase’s NoSQL solutions to power their mission-critical applications on Azure.

During our Lunch & Learn, we’ll discuss:

  • The basics of using Couchbase NoSQL for Azure cloud or hybrid cloud environments
  • Using Containers – Couchbase Server on Azure
  • Why leading organizations are using Azure & Couchbase with their modern web and mobile applications
  • Provisioning VMs in Azure and setting up Couchbase
  • Good general practices for Couchbase Server on Azure

Register Now to reserve your seat, and please share this invitation with your coworkers or anyone else who might be interested. If you have any questions, please leave a comment, email me at matthew.groves@couchbase.com, or contact me on Twitter @mgroves.

You may want to try out Couchbase on Azure before you come to the lab: you can find the latest Couchbase Server 4.6.0 release in the Azure marketplace.

The post Microsoft Azure and Couchbase Hands on Lab (Detroit) appeared first on The Couchbase Blog.

Categories: Architecture, Database

Migrating Your MongoDB with Mongoose RESTful API to Couchbase with Ottoman

NorthScale Blog - Mon, 02/20/2017 - 20:54

When talking to Node.js developers, it is common to hear about NoSQL as the database of choice for development. JavaScript and JSON go hand in hand; after all, JSON stands for JavaScript Object Notation, and it is the format most common in the document-oriented databases that Node.js developers tend to use.

A very popular stack of development technologies is the MongoDB, Express Framework, Angular, and Node.js (MEAN) stack, but similarly there is also the Couchbase, Express Framework, Angular, and Node.js (CEAN) stack. Now don’t get me wrong, every technology I listed is great, but when your applications need to scale and maintain their performance, you might have better luck with Couchbase because of how it functions by design.

So what if you’re already using MongoDB in your Node.js application?

Chances are you’re using Mongoose, which is an object document mapper (ODM) for interacting with the database. Couchbase also has an ODM, and it is called Ottoman. The great thing about these two ODM technologies is that they share pretty much the same set of APIs, making any transition incredibly easy.

We’re going to see how to take a MongoDB with Mongoose driven Node.js application and migrate it to Couchbase, using Ottoman.

The Requirements

This tutorial is going to be a little different because of all the technologies involved. We’re going to be building everything from scratch for simplicity, so the following are the requirements and recommendations:

  • Node.js
  • MongoDB Server
  • Couchbase Server

We’re going to start by building a MongoDB with Mongoose RESTful API in Node.js, hence the Node.js and MongoDB requirement. Then we’re going to take this application and migrate it to Couchbase.

For purposes of this example, we won’t be seeing how to configure Node.js, MongoDB, or Couchbase Server.

Understanding our NoSQL Data Model

Both MongoDB and Couchbase are document databases. One stores BSON data and the other stores JSON; from the developer’s perspective, however, they are incredibly similar. That said, let’s design a few models based around students attending courses at a school. The first model we create might be for actual courses, where a single course might look like the following:

{
    "id": "course-1",
    "type": "course",
    "name": "Computer Science 101",
    "term": "F2017",
    "students": [
        "student-1",
        "student-2"
    ]
}

In the above, notice that the course has a unique id, and we’ve defined it as being a course. The course has naming information as well as a list of students that are enrolled.

Now let’s say we want to define our model for student documents:

{
    "id": "student-1",
    "type": "student",
    "firstname": "Nic",
    "lastname": "Raboy",
    "courses": [
        "course-1",
        "course-25"
    ]
}

Notice that the above model has a similar format to that of the courses. What we’re saying here is that both documents are related, but still semi-structured. We’re saying that each course keeps track of its students and each student keeps track of their courses. This is useful when we try to query the data.

There are unlimited possibilities when it comes to modeling your NoSQL data. In fact, there are probably more than one hundred ways to define a courses-and-students model beyond the one I decided on. It is totally up to you, and that is the flexibility that NoSQL brings. More information on data modeling can be found here.

With a data model in mind, we can create a simple set of API endpoints using both MongoDB with Mongoose and Couchbase with Ottoman.

Developing an API with Express Framework and MongoDB

Because we’re, in theory, migrating away from MongoDB to Couchbase, it would make sense to figure out what we want in a MongoDB application first.

Create a new directory somewhere on your computer to represent the first part of our project. Within this directory, execute the following:

npm init --y
npm install express body-parser mongodb mongoose --save

The above commands will create a file called package.json that will keep track of each of the four project dependencies. The express dependency is for the Express framework, and the body-parser dependency allows request bodies to exist in POST, PUT, and DELETE requests, all of which are common for altering data. Then mongodb and mongoose are required for working with the database.

The project we build will have the following structure:

app.js
routes/
    courses.js
    students.js
models/
    models.js
package.json
node_modules/

Go ahead and create those directories and files if they don’t already exist. The app.js file will be the application driver, whereas the routes will contain our API endpoints and the models will contain the database definitions for our application.

Defining the Mongoose Schemas

So let’s work backwards, starting with the Mongoose model that will communicate with MongoDB. Open the project’s models/models.js file and include the following:

var Mongoose = require("mongoose");
var Schema = Mongoose.Schema;
var ObjectId = Mongoose.SchemaTypes.ObjectId;

var CourseSchema = new Mongoose.Schema({
    name: String,
    term: String,
    students: [
        {
            type: ObjectId,
            ref: "Student"
        }
    ]
});

var StudentSchema = new Mongoose.Schema({
    firstname: String,
    lastname: String,
    courses: [
        {
            type: ObjectId,
            ref: "Course"
        }
    ]
});

module.exports.CourseModel = Mongoose.model("Course", CourseSchema);
module.exports.StudentModel = Mongoose.model("Student", StudentSchema);

In the above we’re creating MongoDB document schemas and then creating models out of them. Notice how similar the schemas are to the JSON models we defined previously, outside of the application. We’re not declaring an id and type because the ODM handles this for us. In each of the arrays we store a reference to the other model. What actually gets saved is a document id, but we can leverage the querying features to expand that id into the actual document.

So how do we use those models?

Creating the RESTful API Routes

Now we want to create routing information, or in other words, API endpoints. For example, let’s create all the CRUD endpoints for course information. In the project’s routes/courses.js file, add the following:

var CourseModel = require("../models/models").CourseModel;

var router = function(app) {

    app.get("/courses", function(request, response) {
        CourseModel.find({}).populate("students").then(function(result) {
            response.send(result);
        }, function(error) {
            response.status(401).send({ "success": false, "message": error});
        });
    });

    app.get("/course/:id", function(request, response) {
        CourseModel.findOne({"_id": request.params.id}).populate("students").then(function(result) {
            response.send(result);
        }, function(error) {
            response.status(401).send({ "success": false, "message": error});
        });
    });

    app.post("/courses", function(request, response) {
        var course = new CourseModel({
            "name": request.body.name
        });
        course.save(function(error, course) {
            if(error) {
                return response.status(401).send({ "success": false, "message": error});
            }
            response.send(course);
        });
    });

}

module.exports = router;

In the above example we have three endpoints. We can view all available courses, view courses by id, and create new courses. Each endpoint is powered by Mongoose.

app.post("/courses", function(request, response) {
    var course = new CourseModel({
        "name": request.body.name
    });
    course.save(function(error, course) {
        if(error) {
            return response.status(401).send({ "success": false, "message": error});
        }
        response.send(course);
    });
});

When creating a document, the request POST data is added to a new model instantiation. Once save is called, the document gets saved to MongoDB. Similar things happen when reading data from the database.

app.get("/courses", function(request, response) {
    CourseModel.find({}).populate("students").then(function(result) {
        response.send(result);
    }, function(error) {
        response.status(401).send({ "success": false, "message": error});
    });
});

In the case of the above, the find function is called and parameters are passed in. When there are no parameters, all documents are returned from the Course collection; otherwise, data is queried by the properties passed. The populate function loads the referenced documents so that, instead of id values, the actual documents are returned.
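
For instance, reusing the term field from the sample course document earlier (the value here is just an example), a filtered query looks like this:

CourseModel.find({ "term": "F2017" }).populate("students").then(function(result) {
    console.log(result);
}, function(error) {
    console.log(error);
});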

Now let’s take a look at the other route.

The second route file is responsible for student data, with one addition: it is also where we manage the document relationships. Open the project’s routes/students.js file and include the following source code:

var CourseModel = require("../models/models").CourseModel;
var StudentModel = require("../models/models").StudentModel;

var router = function(app) {

    app.get("/students", function(request, response) {
        StudentModel.find({}).populate("courses").then(function(result) {
            response.send(result);
        }, function(error) {
            response.status(401).send({ "success": false, "message": error});
        });
    });

    app.get("/student/:id", function(request, response) {
        StudentModel.findOne({"_id": request.params.id}).populate("courses").then(function(result) {
            response.send(result);
        }, function(error) {
            response.status(401).send({ "success": false, "message": error});
        });
    });

    app.post("/students", function(request, response) {
        var student = new StudentModel({
            "firstname": request.body.firstname,
            "lastname": request.body.lastname
        });
        student.save(function(error, student) {
            if(error) {
                return response.status(401).send({ "success": false, "message": error});
            }
            response.send(student);
        });
    });

    app.post("/student/course", function(request, response) {
        CourseModel.findOne({"_id": request.body.course_id}).then(function(course) {
            StudentModel.findOne({"_id": request.body.student_id}).then(function(student) {
                if(course != null && student != null) {
                    if(!student.courses) {
                        student.courses = [];
                    }
                    if(!course.students) {
                        course.students = [];
                    }
                    student.courses.push(course._id);
                    course.students.push(student._id);
                    student.save();
                    course.save();
                    response.send(student);
                } else {
                    return response.status(401).send({ "success": false, "message": "The `student_id` or `course_id` was invalid"});
                }
            }, function(error) {
                return response.status(401).send({ "success": false, "message": error});
            });
        }, function(error) {
            return response.status(401).send({ "success": false, "message": error});
        });
    });

}

module.exports = router;

The first three API endpoints should look familiar. The new endpoint student/course is responsible for adding students to a course and courses to a student.

The first thing that happens is a course is found based on a request id. Next, a student is found based on a different request id. If both documents are found then the ids are added to each of the appropriate arrays and the documents are saved once again.

The final step here is to create our application driver. This will connect to the database and serve the application to be consumed by clients.

Connecting to MongoDB and Serving the Application

Open the project’s app.js file and add the following code:

var Mongoose = require("mongoose");
var Express = require("express");
var BodyParser = require("body-parser");

var app = Express();
app.use(BodyParser.json());

Mongoose.Promise = Promise;
var studentRoutes = require("./routes/students")(app);
var courseRoutes = require("./routes/courses")(app);
Mongoose.connect("mongodb://localhost:27017/example", function(error, database) {
    if(error) {
        return console.log("Could not establish a connection to MongoDB");
    }
    var server = app.listen(3000, function() {
        console.log("Connected on port 3000...");
    });
});

In the above code we are importing each of the dependencies that we previously installed. Then we are initializing Express and telling it to accept JSON bodies in requests.

The routes that were previously created need to be linked to Express, so we’re importing them and passing the Express instance. Finally, a connection to MongoDB is made with Mongoose and the application starts serving.
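
With the server listening on port 3000, one quick way to sanity-check the API is a short script like the following. This is just a sketch using Node’s built-in http module; the course name is an arbitrary example value:

var http = require("http");

// Create a course, then list all courses once the POST has completed
var body = JSON.stringify({ "name": "Computer Science 101" });
var post = http.request({
    host: "localhost",
    port: 3000,
    path: "/courses",
    method: "POST",
    headers: {
        "Content-Type": "application/json",
        "Content-Length": Buffer.byteLength(body)
    }
}, function(response) {
    response.on("data", function() {});
    response.on("end", function() {
        http.get("http://localhost:3000/courses", function(listResponse) {
            var raw = "";
            listResponse.on("data", function(chunk) { raw += chunk; });
            listResponse.on("end", function() {
                console.log(JSON.parse(raw));
            });
        });
    });
});
post.write(body);
post.end();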

Not particularly difficult right?

Developing an API with Express Framework and Couchbase

We saw how to create an API with Node.js, Mongoose, and MongoDB; now we need to accomplish the same thing with Node.js, Ottoman, and Couchbase. Again, this is to show how easy it is to transition from MongoDB to Couchbase and get all the benefits of an enterprise-ready, powerful NoSQL database.

Create a new directory somewhere on your computer and within it, execute the following to create a new project:

npm init --y
npm install express body-parser couchbase ottoman --save

The above commands are similar to what we saw previously, with the exception that now we’re using Couchbase and Ottoman. The project we build will have exactly the same structure, and as a refresher, it looks like the following:

app.js
routes/
    courses.js
    students.js
models/
    models.js
package.json
node_modules/

All Ottoman models will exist in the models directory, all API endpoints and Ottoman logic will exist in the routes directory and all driver logic will exist in the app.js file.

Defining the Ottoman Models

We’re going to work in the same direction that we did for the MongoDB application to show the ease of transition. This means starting with the Ottoman models that will represent our data in Couchbase Server.

Open the project’s models/models.js file and include the following:

var Ottoman = require("ottoman");

var CourseModel = Ottoman.model("Course", {
    name: { type: "string" },
    term: { type: "string" },
    students: [
        {
            ref: "Student"
        }
    ]
});

var StudentModel = Ottoman.model("Student", {
    firstname: { type: "string" },
    lastname: { type: "string" },
    courses: [
        {
            ref: "Course"
        }
    ]
});

module.exports.StudentModel = StudentModel;
module.exports.CourseModel = CourseModel;

The above should look familiar, yet you have to realize that these are two very different ODMs. Instead of designing MongoDB schemas through Mongoose we can go straight to designing JSON models for Couchbase with Ottoman. Remember there are no schemas in Couchbase Buckets.

Each Ottoman model has a set of properties and an array referencing other documents. While the syntax is slightly different, it accomplishes the same thing.

This brings us to the API endpoints that use these models.

Creating the RESTful API Endpoints

The first set of endpoints that we want to create are in relation to managing courses. Open the project’s routes/courses.js file and include the following JavaScript code:

var CourseModel = require("../models/models").CourseModel;

var router = function(app) {

    app.get("/courses", function(request, response) {
        CourseModel.find({}, {load: ["students"]}, function(error, result) {
            if(error) {
                return response.status(401).send({ "success": false, "message": error});
            }
            response.send(result);
        });
    });

    app.get("/course/:id", function(request, response) {
        CourseModel.getById(request.params.id, {load: ["students"]}, function(error, result) {
            if(error) {
                return response.status(401).send({ "success": false, "message": error});
            }
            response.send(result);
        });
    });

    app.post("/courses", function(request, response) {
        var course = new CourseModel({
            "name": request.body.name
        });
        course.save(function(error, result) {
            if(error) {
                return response.status(401).send({ "success": false, "message": error});
            }
            response.send(course);
        });
    });

}

module.exports = router;

In the above code we have three endpoints structured in a nearly identical way to what we saw with MongoDB and Mongoose. However, there are some minor differences. For example, instead of using promises we’re using callbacks.

One of the more visible differences is how querying is done. Not only do we have access to a find function like we saw in Mongoose, but we also have access to a getById function. In both cases we can pass options describing how we expect the query to happen. Instead of a populate function we use load and provide the names of the referenced properties we wish to load. The concepts between Mongoose and Ottoman are very much the same.
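
For instance, mirroring the routes above, a filtered Ottoman query with its referenced documents loaded might look like this (the term value is just an example):

CourseModel.find({ "term": "F2017" }, { load: ["students"] }, function(error, courses) {
    if(error) {
        return console.log(error);
    }
    console.log(courses);
});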

This brings us to our second set of routes. Open the project’s routes/students.js file and include the following JavaScript code:

var StudentModel = require("../models/models").StudentModel;
var CourseModel = require("../models/models").CourseModel;

var router = function(app) {

    app.get("/students", function(request, response) {
        StudentModel.find({}, {load: ["courses"]}, function(error, result) {
            if(error) {
                return response.status(401).send({ "success": false, "message": error});
            }
            response.send(result);
        });
    });

    app.get("/student/:id", function(request, response) {
        StudentModel.getById(request.params.id, {load: ["courses"]}, function(error, result) {
            if(error) {
                return response.status(401).send({ "success": false, "message": error});
            }
            response.send(result);
        });
    });

    app.post("/students", function(request, response) {
        var student = new StudentModel({
            "firstname": request.body.firstname,
            "lastname": request.body.lastname
        });
        student.save(function(error, result) {
            if(error) {
                return response.status(401).send({ "success": false, "message": error});
            }
            response.send(student);
        });
    });

    app.post("/student/course", function(request, response) {
        CourseModel.getById(request.body.course_id, function(error, course) {
            if(error) {
                return response.status(401).send({ "success": false, "message": error});
            }
            StudentModel.getById(request.body.student_id, function(error, student) {
                if(error) {
                    return response.status(401).send({ "success": false, "message": error});
                }
                if(!student.courses) {
                    student.courses = [];
                }
                if(!course.students) {
                    course.students = [];
                }
                student.courses.push(CourseModel.ref(course._id));
                course.students.push(StudentModel.ref(student._id));
                student.save(function(error, result) {});
                course.save(function(error, result) {});
                response.send(student);
            });
        });
    })
}

module.exports = router;

We already know the first three endpoints are going to be of the same format. We want to pay attention to the last endpoint which manages our relationships.

With this endpoint we are obtaining a course by its id value and a student by its id value. As long as both return a document, we can add a reference to each in the other’s array and re-save the documents. The same thing, with nearly the same code, was done in the Mongoose version.

Now we can look at the logic to start serving the application after connecting to the database.

Connecting to Couchbase and Serving the Application

Open the project’s app.js file and include the following JavaScript:

var Couchbase = require("couchbase");
var Ottoman = require("ottoman");
var Express = require("express");
var BodyParser = require("body-parser");

var app = Express();
app.use(BodyParser.json());

var bucket = (new Couchbase.Cluster("couchbase://localhost")).openBucket("example");
Ottoman.store = new Ottoman.CbStoreAdapter(bucket, Couchbase);
var studentRoutes = require("./routes/students")(app);
var courseRoutes = require("./routes/courses")(app);
var server = app.listen(3000, function() {
    console.log("Connected on port 3000...");
});

Does the above look familiar? It should! We are just swapping out the Mongoose connection information with the Couchbase connection information. After connecting to the database we can start serving the application.
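
One thing worth noting: depending on your Ottoman version, queries such as find may rely on indexes that need to be built first. Ottoman 1.x exposes an ensureIndices function for this; the following is a hedged sketch (verify against the Ottoman documentation for your version) that could run right after the store is configured:

// Build any indexes that the Ottoman models require before serving queries
Ottoman.ensureIndices(function(error) {
    if(error) {
        return console.log("Could not create Ottoman indices: " + error);
    }
    console.log("Ottoman indices are ready");
});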

Conclusion

You just saw how to build a RESTful API with Node.js, Mongoose, and MongoDB, then bring it to Couchbase in a very seamless fashion. This was meant to prove that the migration process is nothing to be scared of if you’re using Node.js as your backend technology.

With Couchbase you have a high-performance, distributed NoSQL database that works at any scale. Because caching is built into Couchbase, the need for a separate cache in front of your database is eliminated. For more information on using Ottoman, you can check out a previous blog post I wrote. More information on using Couchbase with Node.js can be found in the Couchbase Developer Portal.

The post Migrating Your MongoDB with Mongoose RESTful API to Couchbase with Ottoman appeared first on The Couchbase Blog.

Categories: Architecture, Database

New Profiling and Monitoring in Couchbase Server 5.0 Preview

NorthScale Blog - Mon, 02/20/2017 - 20:07

N1QL query monitoring and profiling updates are just some of the goodness you can find in February’s developer preview release of Couchbase Server 5.0.0.

Go download the February 5.0.0 developer release of Couchbase Server today, click the “Developer” tab, and check it out. You still have time to give us some feedback before the official release.

As always, keep in mind that I’m writing this blog post on early builds, and some things may change in minor ways by the time you get the release.

What is profiling and monitoring for?

When I’m writing N1QL queries, I need to be able to understand how well (or how badly) my query (and my cluster) is performing in order to make improvements and diagnose issues.

With this latest developer version of Couchbase Server 5.0, some new tools have been added to your N1QL-writing toolbox.

N1QL Writing Review

First, some review.

There are multiple ways for a developer to execute N1QL queries.

In this post, I’ll be mainly using Query Workbench.

There are two system catalogs that are already available to you in Couchbase Server 4.5 that I’ll be talking about today.

  • system:active_requests – This catalog lists all the currently executing active requests or queries. You can execute the N1QL query SELECT * FROM system:active_requests; and it will list all those results.

  • system:completed_requests – This catalog lists all the recent completed requests (that have run longer than some threshold of time, default of 1 second). You can execute SELECT * FROM system:completed_requests; and it will list these queries.

New to N1QL: META().plan

Both active_requests and completed_requests return not only the original N1QL query text, but also related information: request time, request id, execution time, scan consistency, and so on. This can be useful information. Here’s an example that looks at a simple query (select * from `travel-sample`) while it’s running, by executing select * from system:active_requests;

{
	"active_requests": {
	  "clientContextID": "805f519d-0ffb-4adf-bd19-15238c95900a",
	  "elapsedTime": "645.4333ms",
	  "executionTime": "645.4333ms",
	  "node": "10.0.75.1",
	  "phaseCounts": {
		"fetch": 6672,
		"primaryScan": 7171
	  },
	  "phaseOperators": {
		"fetch": 1,
		"primaryScan": 1
	  },
	  "phaseTimes": {
		"authorize": "500.3µs",
		"fetch": "365.7758ms",
		"parse": "500µs",
		"primaryScan": "107.3891ms"
	  },
	  "requestId": "80787238-f4cb-4d2d-999f-7faff9b081e4",
	  "requestTime": "2017-02-10 09:06:18.3526802 -0500 EST",
	  "scanConsistency": "unbounded",
	  "state": "running",
	  "statement": "select * from `travel-sample`;"
	}
}

First, I want to point out that phaseTimes is a new addition to the results. It’s a quick and dirty way to get a sense of the query cost without looking at the whole profile. It gives you the overall cost of each request phase without going into detail of each operator. In the above example, for instance, you can see that parse took 500µs and primaryScan took 107.3891ms. This might be enough information for you to go on without diving into META().plan.

However, with the new META().plan, you can get very detailed information about the query plan. This time, I’ll execute SELECT *, META().plan FROM system:active_requests;

[
  {
    "active_requests": {
      "clientContextID": "75f0f401-6e87-48ae-bca8-d7f39a6d029f",
      "elapsedTime": "1.4232754s",
      "executionTime": "1.4232754s",
      "node": "10.0.75.1",
      "phaseCounts": {
        "fetch": 12816,
        "primaryScan": 13231
      },
      "phaseOperators": {
        "fetch": 1,
        "primaryScan": 1
      },
      "phaseTimes": {
        "authorize": "998.7µs",
        "fetch": "620.704ms",
        "primaryScan": "48.0042ms"
      },
      "requestId": "42f50724-6893-479a-bac0-98ebb1595380",
      "requestTime": "2017-02-15 14:44:23.8560282 -0500 EST",
      "scanConsistency": "unbounded",
      "state": "running",
      "statement": "select * from `travel-sample`;"
    },
    "plan": {
      "#operator": "Sequence",
      "#stats": {
        "#phaseSwitches": 1,
        "kernTime": "1.4232754s",
        "state": "kernel"
      },
      "~children": [
        {
          "#operator": "Authorize",
          "#stats": {
            "#phaseSwitches": 3,
            "kernTime": "1.4222767s",
            "servTime": "998.7µs",
            "state": "kernel"
          },
          "privileges": {
            "default:travel-sample": 1
          },
          "~child": {
            "#operator": "Sequence",
            "#stats": {
              "#phaseSwitches": 1,
              "kernTime": "1.4222767s",
              "state": "kernel"
            },
            "~children": [
              {
                "#operator": "PrimaryScan",
                "#stats": {
                  "#itemsOut": 13329,
                  "#phaseSwitches": 53319,
                  "execTime": "26.0024ms",
                  "kernTime": "1.3742725s",
                  "servTime": "22.0018ms",
                  "state": "kernel"
                },
                "index": "def_primary",
                "keyspace": "travel-sample",
                "namespace": "default",
                "using": "gsi"
              },
              {
                "#operator": "Fetch",
                "#stats": {
                  "#itemsIn": 12817,
                  "#itemsOut": 12304,
                  "#phaseSwitches": 50293,
                  "execTime": "18.5117ms",
                  "kernTime": "787.9722ms",
                  "servTime": "615.7928ms",
                  "state": "services"
                },
                "keyspace": "travel-sample",
                "namespace": "default"
              },
              {
                "#operator": "Sequence",
                "#stats": {
                  "#phaseSwitches": 1,
                  "kernTime": "1.4222767s",
                  "state": "kernel"
                },
                "~children": [
                  {
                    "#operator": "InitialProject",
                    "#stats": {
                      "#itemsIn": 11849,
                      "#itemsOut": 11848,
                      "#phaseSwitches": 47395,
                      "execTime": "5.4964ms",
                      "kernTime": "1.4167803s",
                      "state": "kernel"
                    },
                    "result_terms": [
                      {
                        "expr": "self",
                        "star": true
                      }
                    ]
                  },
                  {
                    "#operator": "FinalProject",
                    "#stats": {
                      "#itemsIn": 11336,
                      "#itemsOut": 11335,
                      "#phaseSwitches": 45343,
                      "execTime": "6.5002ms",
                      "kernTime": "1.4157765s",
                      "state": "kernel"
                    }
                  }
                ]
              }
            ]
          }
        },
        {
          "#operator": "Stream",
          "#stats": {
            "#itemsIn": 10824,
            "#itemsOut": 10823,
            "#phaseSwitches": 21649,
            "kernTime": "1.4232754s",
            "state": "kernel"
          }
        }
      ]
    }
  }, ...
]

The above output comes from the Query Workbench.

Note the new “plan” part. It contains a tree of operators that combine to execute the N1QL query. The root operator is a Sequence, which itself has a collection of child operators like Authorize, PrimaryScan, Fetch, and possibly even more Sequences.
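
Because the plan is just JSON, it is also easy to post-process yourself. Here is a rough illustration (not an official tool) that walks a plan object like the one above and prints each operator along with its execTime:

// Recursively walk a META().plan tree and print operator names with execTime
function walkPlan(operator, depth) {
    if(!operator || !operator["#operator"]) {
        return;
    }
    var stats = operator["#stats"] || {};
    var indent = new Array(depth + 1).join("  ");
    console.log(indent + operator["#operator"] + (stats.execTime ? " (" + stats.execTime + ")" : ""));
    (operator["~children"] || []).forEach(function(child) {
        walkPlan(child, depth + 1);
    });
    if(operator["~child"]) {
        walkPlan(operator["~child"], depth + 1);
    }
}

// `plan` would be the "plan" property from a row of
// SELECT *, META().plan FROM system:active_requests;
// walkPlan(plan, 0);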

Enabling the profile feature

To get this information when using cbq or the REST API, you’ll need to turn on the “profile” feature.

You can do this in cbq by entering set -profile timings; and then running your query.

You can also do this with the REST API on a per request basis (using the /query/service endpoint and passing a querystring parameter of profile=timings, for instance).
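
As a rough sketch of the per-request approach (assuming the query service is reachable on localhost:8093 and the bucket does not require authentication in your setup), the same thing can be done from Node.js:

var http = require("http");

// Send a N1QL statement to the query service with profiling enabled for this request only;
// profile=timings is passed as a querystring parameter
var statement = encodeURIComponent("SELECT * FROM `travel-sample` LIMIT 1;");
http.get("http://localhost:8093/query/service?statement=" + statement + "&profile=timings",
    function(response) {
        var raw = "";
        response.on("data", function(chunk) { raw += chunk; });
        response.on("end", function() {
            // The response includes the profiling output alongside the query results
            console.log(JSON.parse(raw));
        });
    });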

You can turn on the setting for the entire node by making a POST request to http://localhost:8093/admin/settings, using Basic authentication, and a JSON body like:

{
  "completed-limit": 4000,
  "completed-threshold": 1000,
  "controls": false,
  "cpuprofile": "",
  "debug": false,
  "keep-alive-length": 16384,
  "loglevel": "INFO",
  "max-parallelism": 1,
  "memprofile": "",
  "pipeline-batch": 16,
  "pipeline-cap": 512,
  "pretty": true,
  "profile": "timings",
  "request-size-cap": 67108864,
  "scan-cap": 0,
  "servicers": 32,
  "timeout": 0
}

Notice the profile setting. It was previously set to off, but I set it to “timings”.

You may not want to do that, especially on nodes being used by other people and programs, because it will affect other queries running on the node. It’s better to do this on a per-request basis.

It’s also what Query Workbench does by default.

Using the Query Workbench

There’s a lot of information in META().plan about how the plan is executed. Personally, I prefer to look at a simplified graphical version of it in Query Workbench by clicking the “Plan” icon (which I briefly mentioned in a previous post about the new Couchbase Web Console UI).

Query Workbench plan results

Let’s look at a slightly more complex example. For this exercise, I’m using the travel-sample bucket, but I have removed one of the indexes (DROP INDEX travel-sample.def_sourceairport;).

I then execute a N1QL query to find flights between San Francisco and Miami:

SELECT r.id, a.name, s.flight, s.utc, r.sourceairport, r.destinationairport, r.equipment
FROM `travel-sample` r
UNNEST r.schedule s
JOIN `travel-sample` a ON KEYS r.airlineid
WHERE r.sourceairport = 'SFO'
AND r.destinationairport = 'MIA'
AND s.day = 0
ORDER BY a.name;

Executing this query (on my single-node local machine) takes about 10 seconds. That’s definitely not an acceptable amount of time, so let’s look at the plan to see what the problem might be (I broke it into two parts so the screenshots will fit in the blog post).

Query Workbench plan part 1

Query Workbench plan part 2

Looking at that plan, it seems like the costliest parts of the query are the Filter and the Join. JOIN operations work on keys, so they should normally be very quick. But it looks like there are a lot of documents being joined.

The Filter (the WHERE part of the query) is also taking a lot of time. It’s looking at the sourceairport and destinationairport fields. Looking elsewhere in the plan, I see that there is a PrimaryScan. This should be a red flag when you are trying to write performant queries. PrimaryScan means that the query couldn’t find an index other than the primary index. This is roughly the equivalent of a “table scan” in relational database terms. (You may want to drop the primary index so that these issues get bubbled-up faster, but that’s a topic for another time).

Let’s add an index on the sourceairport field and see if that helps.

CREATE INDEX `def_sourceairport` ON `travel-sample`(`sourceairport`);

Now, running the same query as above, I get the following plan:

Query Workbench improved plan part 1

Query Workbench improved plan part 2

This query took ~100ms (on my single-node local machine), which is much more acceptable. The Filter and the Join still take up a large percentage of the time, but thanks to the IndexScan replacing the PrimaryScan, there are many fewer documents for those operators to deal with. Perhaps the query could be improved even more with an additional index on the destinationairport field.

Beyond Tweaking Queries

The answer to performance problems is not always in tweaking queries. Sometimes you might need to add more nodes to your cluster to address the underlying problem.

Look at the PrimaryScan information in META().plan. Here’s a snippet:

"~children": [
  {
    "#operator": "PrimaryScan",
    "#stats": {
      "#itemsOut": 13329,
      "#phaseSwitches": 53319,
      "execTime": "26.0024ms",
      "kernTime": "1.3742725s",
      "servTime": "22.0018ms",
      "state": "kernel"
    },
    "index": "def_primary",
    "keyspace": "travel-sample",
    "namespace": "default",
    "using": "gsi"
  }, ... ]

The servTime value indicates how much time the query service spends waiting on the key/value data storage. If servTime is very high but only a small number of documents is being processed, that indicates that the indexer (or the key/value service) can’t keep up, perhaps because of load coming from somewhere else. Either something unexpected is running elsewhere, or your cluster is trying to handle too much load; it might be time to add some more nodes.

Similarly, the kernTime is how much time is spent waiting on other N1QL routines. This might mean that something else downstream in the query plan has a problem, or that the query node is overrun with requests and is having to wait a lot.

We want your feedback!

The new META().plan functionality and the new Plan UI combine in Couchbase Server 5.0 to improve the N1QL writing and profiling process.

Stay tuned to the Couchbase Blog for information about what’s coming in the next developer build.

Interested in trying out some of these new features? Download Couchbase Server 5.0 today!

We want feedback! Developer releases are coming every month, so you have a chance to make a difference in what we are building.

Bugs: If you find a bug (something that is broken or doesn’t work how you’d expect), please file an issue in our JIRA system at issues.couchbase.com or submit a question on the Couchbase Forums. Or, contact me with a description of the issue. I would be happy to help you or submit the bug for you (my Couchbase handlers high-five me every time I submit a good bug).

Feedback: Let me know what you think. Something you don’t like? Something you really like? Something missing? Now you can give feedback directly from within the Couchbase Web Console. Look for the feedback icon at the bottom right of the screen.

In some cases, it may be tricky to decide if your feedback is a bug or a suggestion. Use your best judgement, or again, feel free to contact me for help. I want to hear from you. The best way to contact me is either Twitter @mgroves or email me matthew.groves@couchbase.com.

The post New Profiling and Monitoring in Couchbase Server 5.0 Preview appeared first on The Couchbase Blog.

Categories: Architecture, Database

Timestamp-based conflict resolution in XDCR – a QE’s perspective

NorthScale Blog - Mon, 02/20/2017 - 15:01

Introduction

Cross Datacenter Replication (XDCR)  is an important core feature of Couchbase that helps users in disaster recovery and data locality. Conflict resolution is an inevitable challenge faced by XDCR when a document is modified in two different locations before it has been synchronized between the locations.

Until 4.6, Couchbase only supported a revision ID-based strategy to handle conflict resolution. In this strategy, a document’s revision ID, which is updated every time it is modified, is used as the first field to decide the winner. If the revision IDs of both contestants are the same, then CAS, TTL, and flags are used, in that order, to resolve the conflict. This strategy works best for applications designed around a “most updates is best” policy. For example, a ticker app used by conductors on a train, which updates a counter stored in Couchbase Server to count the number of passengers, will work best with this policy and hence perform accurately with revision ID-based conflict resolution.

Starting with 4.6, Couchbase supports an additional strategy called timestamp-based conflict resolution. Here, the timestamp of a document, which is stored in CAS, is used as the first field to decide the winner. In order to keep a consistent ordering of mutations, Couchbase uses a hybrid logical clock (HLC), which is a combination of a physical clock and a logical clock. If the timestamps of both contestants are the same, then revision ID, TTL, and flags are used, in that order, to resolve the conflict. This strategy is adopted to facilitate applications designed around a “most recent update is best” policy. For example, a flight tracking app that stores the estimated arrival time of a flight in Couchbase Server will perform accurately with this conflict resolution. In short, this mechanism can be summarized as “last write wins”.

One should understand that the most updated document need not be the most recent document, and vice versa. So the user really needs to understand the application’s design, needs, and data pattern before deciding which conflict resolution mechanism to use. For the same reason, Couchbase has designed the conflict resolution mechanism as a bucket-level parameter. Users need to decide and select the strategy they wish to follow while creating the bucket. Once a bucket is created with a particular conflict resolution mechanism via the UI, REST API, or CLI, it cannot be changed; the user will have to delete and recreate the bucket to change the strategy. Also, to avoid confusion and complications, Couchbase restricts XDCR from being set up in mixed mode, i.e., the source and destination buckets cannot have different conflict resolution strategies selected. Both have to use either revision ID-based or timestamp-based conflict resolution. If the user tries to set up otherwise via the UI, REST API, or CLI, an error message is displayed.

Timestamp-based conflict resolution Use Cases

High Availability with Cluster Failover

Here, all database operations go to Datacenter A and are replicated via XDCR to Datacenter B. If the cluster located in Datacenter A fails then the application fails all traffic over to Datacenter B.

Datacenter Locality

Here, two active clusters operate on discrete sets of documents. This ensures no conflicts are generated during normal operation. A bi-directional XDCR relationship is configured to replicate their updates to each other. When one cluster fails, application traffic can be failed over to the remaining active cluster.

How does Timestamp-based conflict resolution ensure safe failover?

Timestamp-based conflict resolution requires that applications only allow traffic to the other Datacenter after the maximum of the following two time periods has elapsed:

  1. The replication latency between A and B. This allows any mutations in-flight to be received by Datacenter B.
  2. The absolute time skew between Datacenter A and Datacenter B. This ensures that any writes to Datacenter B occur after the last write to Datacenter A, after the calculated delay, at which point all database operations would go to Datacenter B.

When availability is restored to Datacenter A, applications must observe the same time period before redirecting their traffic. For both of the use cases described above, using timestamp-based conflict resolution ensures that the most recent version of any document will be preserved.

How to configure NTP for Timestamp-based conflict resolution?

A prerequisite that users should keep in mind before opting for timestamp-based conflict resolution is that they need to use synchronized clocks to ensure the accuracy of this strategy. Couchbase advises them to use Network Time Protocol (NTP) to synchronize time across multiple servers. The users will have to configure their clusters to periodically synchronize their wall clocks with a particular NTP server or a pool of NTP peers to ensure availability. Clock synchronization is key to the accuracy of the Hybrid Logical Clock used by Couchbase to resolve conflicts based on timestamps.

As a QE, testing timestamp-based conflict resolution was a good learning experience. One of the major challenges was learning how NTP works. The default setup for all the test cases is to enable NTP, start the service, sync up the wall clock with 0.north-america.pool.ntp.org, and then proceed with the test. These steps were achieved using the following commands in setup:

~$ chkconfig ntpd on

~$ /etc/init.d/ntpd start

~$ ntpdate -q 0.north-america.pool.ntp.org

Once the test is done and results are verified, NTP service is stopped and disabled using the following commands:

~$ chkconfig ntpd off

~$ /etc/init.d/ntpd stop

This is the vanilla setup, where all the individual nodes sync up their wall clocks with 0.north-america.pool.ntp.org. It was interesting to automate test cases where nodes sync up their wall clocks with a pool of NTP peers, where the source and destination clusters sync with different NTP pools (A (0.north-america.pool.ntp.org) -> B (3.north-america.pool.ntp.org)), and where each cluster in a chain topology of length 3 (A (EST) -> B (CST) -> C (PST)) is in a different timezone. We had to manually configure these scenarios, observe the behaviour, and then automate them.

How did we test NTP based negative scenarios?

The next challenge was to test scenarios where NTP is not running on the Couchbase nodes and there is a time skew between the source and destination. A time skew can also arise when the wall clock difference across clusters is large: any time synchronization mechanism takes some time to bring the clocks together, leaving a time-skewed window. Note that Couchbase only gives an advisory warning while creating a bucket with timestamp-based conflict resolution, stating that the user should ensure a time synchronization mechanism is in place on all the nodes. It does not validate or restrict users from creating such a bucket if no time synchronization mechanism is present. So it is quite possible that a user might ignore this warning, create a bucket with timestamp-based conflict resolution, and observe unexpected behaviour when there is a time skew.

Let us consider one such situation here:

  1. Create default bucket on source and target cluster with timestamp based conflict resolution
  2. Setup XDCR from source to target
  3. Disable NTP on both clusters
  4. Make wall clock of target cluster slower than source cluster by 5 minutes
  5. Pause replication
  6. Create a doc D1 at time T1 in target cluster
  7. Create a doc D2 with same key at time T2 in source cluster
  8. Update D1 in target cluster at time T3
  9. Resume replication
  10. Observe that D2 overwrites D1, even though the update to D1 at T3 is the most recent mutation in real time and should have won

Here the last write in real-time order did not win because the clocks were skewed and not in sync, leading to the incorrect document being declared the winner. This shows how important time synchronization is for the timestamp-based conflict resolution strategy. Figuring out all such scenarios and automating them was indeed a challenge.

How did we test complex scenarios with Timestamp-based Conflict Resolution?

Up next was determining a way to validate the correctness of timestamp-based conflict resolution against the revision ID-based strategy. We needed to perform the same steps in an XDCR setup and verify that the results differed based on the bucket’s conflict resolution strategy. In order to achieve this, we created two different buckets, one configured to use revision ID-based conflict resolution and the other to use timestamp-based. Now follow these steps on both buckets in parallel:

  1. Setup XDCR and pause replication
  2. Create doc D1 in target at time T1
  3. Create doc D2 with same key in source at time T2
  4. Update doc D2 in source at time T3
  5. Update doc D2 in source again at time T4
  6. Update doc D1 in target at time T5
  7. Resume replication

In the first bucket, which is configured to use revision ID-based conflict resolution, doc D1 at the target will be overwritten by D2, as D2 has been mutated the most. In the second bucket, which is configured to use timestamp-based conflict resolution, doc D1 at the target will be declared the winner and retained, as it is the latest to be mutated. Figuring out such scenarios and automating them made our regression exhaustive and robust.

How did we test HLC correctness?

The final challenge was to test the monotonicity of the hybrid logical clock (HLC) used by Couchbase in timestamp-based conflict resolution. Apart from verifying that the HLC remained the same between an active vbucket and its replica, we had some interesting scenarios as follows:

  1. C1 (slower) -> C2 (faster) – mutations made in C1 will lose based on timestamp and C2 will always win – so HLC of C2 should not change after replication
  2. C1 (faster) -> C2 (slower) – mutations made in C1 will always win based on timestamp – so HLC of C2 should be greater than what it was before replication due to monotonicity
  3. Same scenario as 1, even though HLC of C2 did not change due to replication, any updates on C2 should increase its HLC owing to monotonicity
  4. Similarly, for scenario described in 2, apart from C2’s HLC being greater than what it was before replication, more updates to docs on C2 should keep its HLC increasing owing to monotonicity

Thus, all these challenges made testing timestamp based conflict resolution a rewarding QE feat.

The post Timestamp-based conflict resolution in XDCR – a QE’s perspective appeared first on The Couchbase Blog.

Categories: Architecture, Database

Managing Secrets in Couchbase 4.6

NorthScale Blog - Mon, 02/20/2017 - 07:55

Every software application has secrets. Passwords, API keys, and secure tokens all fall into the category of secrets. There are dire consequences if your production secret keys get into the wrong hands. You’ll want to tightly control how and when your secret keys are accessible.

Couchbase has added more services to its infrastructure, and these services have internal and external credentials. Storing the credentials of these services is a challenge, as is rotating the secrets for all internal and external services.

Couchbase 4.6 introduces management of secrets, where all secrets are encrypted when stored and passed correctly to nodes and services, along with easy rotation of secrets. There is no impact on SDK clients, the UI, or performance.

Couchbase maintains a two-level key hierarchy, which makes it easier to rotate the master password without re-encrypting data, supports multiple master passwords, and will make it easier to integrate with a KMIP server. At node startup, the master password is created or supplied by the user, and a master key is derived from it using a strong key derivation function; Couchbase uses PBKDF2 for key generation.

A random data_key is also created at server startup and is encrypted with the master key. All secrets on disk are encrypted using the data_key. Couchbase uses the AES 256-bit algorithm in GCM mode to encrypt secrets.

To bootstrap the system, the master key is used to open the encrypted data key. The decrypted data key is then used to open the encrypted secrets, and the secrets are used to start Couchbase Server.  Couchbase recommends using a strong master password.
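
To make the two-level hierarchy concrete, here is an illustrative sketch of the same pattern using Node’s crypto module. This is not Couchbase’s implementation; it only shows how a PBKDF2-derived master key can wrap a random data key, which in turn encrypts a secret with AES-256 in GCM mode:

var crypto = require("crypto");

// Level 1: derive a master key from the master password with PBKDF2
var masterPassword = "correct horse battery staple"; // example value only
var salt = crypto.randomBytes(16);
var masterKey = crypto.pbkdf2Sync(masterPassword, salt, 100000, 32, "sha256");

// Level 2: generate a random data key and wrap (encrypt) it with the master key
var dataKey = crypto.randomBytes(32);
var iv = crypto.randomBytes(12);
var wrap = crypto.createCipheriv("aes-256-gcm", masterKey, iv);
var wrappedDataKey = Buffer.concat([wrap.update(dataKey), wrap.final()]);
var wrapTag = wrap.getAuthTag(); // authentication tag for the wrapped key

// Secrets are then encrypted with the data key, never with the master key directly
var secretIv = crypto.randomBytes(12);
var cipher = crypto.createCipheriv("aes-256-gcm", dataKey, secretIv);
var encryptedSecret = Buffer.concat([cipher.update("s3cr3t-password", "utf8"), cipher.final()]);

console.log("wrapped data key:", wrappedDataKey.toString("hex"));
console.log("encrypted secret:", encryptedSecret.toString("hex"));

In this scheme, rotating the master password only requires re-wrapping the small data key rather than re-encrypting every secret, which is why the two-level hierarchy makes rotation cheap.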

With Secret Management in 4.6, you can rotate your secrets at different levels of the key hierarchy periodically or in the event of a breach.

The first level of rotation, rotating or resetting the master password, can be done using the REST API or CLI. Couchbase allows the flexibility of setting one master password per node. If the master password is lost and the server is stopped, the node is lost; data from the node can be recovered using other tools shipped with the server.

The second level of rotation is done by changing the data key using the REST API or CLI.

All rotation and setting of the master password is audited.

Here is an example of setting up the server with a master password using the CLI on Ubuntu 14:

  • Install and configure Couchbase Server.
  • Set up the master password using the CLI: execute the command below and enter the password at the prompt.

/opt/couchbase/bin/couchbase-cli master-password -c 192.168.0.1:8091 -u Administrator -p password --new-password

  • Stop the server – /etc/init.d/couchbase-server stop
  • Configure an environment variable

export CB_MASTER_PASSWORD=<password>

  • Start the server – /etc/init.d/couchbase-server start

Note: if you are using sudo to start the server, pass the -E option to sudo so the environment variable is preserved.

  • Rotate the data key using the CLI by executing the command below:

/opt/couchbase/bin/couchbase-cli master-password -c 192.168.0.1:8091 -u Administrator -p password --rotate-data-key

  • To change the master password, execute the command below and enter the new password at the prompt:

/opt/couchbase/bin/couchbase-cli master-password -c 192.168.0.1:8091 -u Administrator -p password --new-password

Logging Details:

Babysitter log on password change:

[ns_server:info,2017-01-20T13:12:30.079Z,babysitter_of_ns_1@127.0.0.1:encryption_service<0.65.0>:encryption_service:call_gosecrets_and_store_data_key:227]Master password change succeded

Babysitter log on incorrect master password during server start, or when the environment variable is set incorrectly:

[ns_server:error,2017-01-20T13:13:07.066Z,babysitter_of_ns_1@127.0.0.1:encryption_service<0.65.0>:encryption_service:init:174]Incorrect master password. Error: {error,"cipher: message authentication failed"}

Babysitter log when the master password is set correctly for Couchbase Server:

=========================PROGRESS REPORT=========================

supervisor: {local,ns_babysitter_sup}

started: [{pid,<0.65.0>},

{name,encryption_service},

{mfargs,{encryption_service,start_link,[]}},

{restart_type,permanent},

{shutdown,1000},

{child_type,worker}]

[ns_server:debug,2017-01-22T12:08:46.432Z,babysitter_of_ns_1@127.0.0.1:<0.70.0>:supervisor_cushion:init:39]starting ns_port_server with delay of 5000

The post Managing Secrets in Couchbase 4.6 appeared first on The Couchbase Blog.

Categories: Architecture, Database

Deploy Docker Container to Oracle Container Cloud Service

NorthScale Blog - Sat, 02/18/2017 - 16:07

Getting Started with Oracle Container Cloud Service explained how to get started with Oracle’s managed container service. Well, the intent was to show how to get started, but getting to “getting started” was itself quite involved. This blog will now really show how to run a simple Docker container on Oracle Container Cloud Service.

Oracle Container Cloud Service is built upon Oracle’s StackEngine acquisition, which was completed 1.5 years ago. The basis for this blog is a 4-node cluster (1 manager and 3 workers) created following the steps in Getting Started with Oracle Container Cloud Service.

Make sure you note down the user name and password of the service specified during the creation. It is golden, and there is no way to either retrieve it or reset it afterwards. UPDATE: @kapmani clarified that the password can be reset by logging into the manager node.

Anyway, the dashboard looks like:

Oracle Cloud Dashboard

Similarly, Container Cloud Console with 4 nodes looks like:

Container Cloud Service is accessible using a REST API, as explained in About Oracle Container Cloud Service REST API. The console itself uses the REST API to fulfill all commands.

Oracle Container Cloud Service Concepts

Let’s learn about some concepts first:

  • Service – A service comprises the necessary configuration for running a Docker image as a container on a host, plus default deployment directives. A service is neither a container nor an image running in a container; it is a high-level configuration object that you can create, deploy, and manage using Oracle Container Cloud Service. Think of a service as a container ‘template’, or as a set of instructions to follow to deploy a running container.
  • Stack – A stack is all the necessary configuration for running a set of services as Docker containers in a coordinated way, managed as a single entity, plus default deployment directives. Think of it as a multi-container application. Stacks themselves are neither containers nor images running in containers, but rather high-level configuration objects that you can create, deploy, and manage using Oracle Container Cloud Service. For example, a stack might be one or more WildFly containers and a Couchbase container. Likewise, a cluster of database or application nodes can be built as a stack.
  • Deployment – A deployment comprises a service or stack in which Docker containers are managed, deployed, and scaled according to a set of orchestration rules that you’ve defined. A single deployment can result in the creation of one or many Docker containers, across one or many hosts in a resource pool.
  • Resource Pool –¬†Resource pools are a way to organize hosts and combine them into isolated groups of compute resources. Resource pools enable you to manage your Docker environment more effectively by deploying services and stacks efficiently across multiple hosts.Three resource pools are defined out of the box:
    Oracle Cloud Default Resource Pool

The rest of the terms, like Containers, Images, and Hosts, are pretty straightforward.

Run Couchbase in Oracle Container Cloud Service
  • Click on Services, New Service
  • Oracle Container Service only supports Compose v2. So a simple Compose file definition can be used for service definition:
    version: "2"
    services:
      db:
        image: arungupta/couchbase
        ports:
          - 8091:8091
          - 8092:8092
          - 8093:8093
          - 11210:11210

    The image arungupta/couchbase is built from github.com/arun-gupta/docker-images/tree/master/couchbase. It uses Couchbase REST API to pre-configure the Couchbase server. The common Couchbase networking ports for application development are also exposed.

    In the YAML tab, use the Compose definition from above:
    Docker Container Oracle Cloud
    Alternatively, you can use the builder or docker run command as well. In our case, use the Compose definition and then specify the Service Description.

  • Click on Save to save the service definition. The updated list now includes the Couchbase service:
    Oracle Cloud Couchbase Service
  • Click on Deploy to deploy the container:
    Oracle Cloud Deploy Couchbase
  • Take the defaults and click on Deploy to start the deployment.
  • The Docker image is downloaded and the container is started. The screen is refreshed to show Deployments:
    Oracle Cloud Deployments Couchbase
    A single instance of the container is now up and running. Other details like resource pool, hostname, and uptime are also displayed.
Details about Couchbase Container in Oracle Cloud

Let’s get some details about the Couchbase container in Oracle Cloud:

  • Click on the container name shown in the Container Name column to see more details about the container:
    Oracle Cloud Containers Couchbase Details
    The typical output that you would get from the docker inspect command is shown here.
  • Click on View Logs to see the container logs:
    Oracle Cloud Containers Couchbase Logs
    This is equivalent to the docker container logs command.
    These logs are generated when the Couchbase REST API configures the server.
  • Click on Hosts to see the complete list of hosts:
    Oracle Cloud Container Couchbase Hosts
  • A single instance of container is running. Select the host that is running the container to see more details:
    Oracle Cloud Containers Couchbase Host Details
    Note the public_ip of the host. This IP address will be used to access the Couchbase Web Console later. Another key point to note here is that this host is running Docker 1.10.3. That is the case with the other hosts as well, as expected.
Access Couchbase

Now, let’s access the Couchbase Web Console. In our case, this is available at 129.152.159.64:8091. This shows the main login screen:

Oracle Cloud Couchbase Web Console

Use Administrator as the username and password as the password, then click Sign In to see the main screen of the console:

Oracle Cloud Couchbase Web Console Main Screen

Click on Server Nodes to see that data, index and query services are running:

Oracle Cloud Couchbase Web Console Server Nodes

Pretty cool, eh!

A future blog post will show how to create a Couchbase cluster, run a simple application against this cluster and other fun stuff.

Use any of the Couchbase Starter Kits to get started with Couchbase.

Want to learn more about running Couchbase in containers?

The post Deploy Docker Container to Oracle Container Cloud Service appeared first on The Couchbase Blog.

Categories: Architecture, Database

SDK Features – New For Couchbase 4.6

NorthScale Blog - Fri, 02/17/2017 - 08:02

Along with this week’s Couchbase Server 4.6 release, we have a super packed release with several new SDK features to help you streamline development. From efficiently managed Data Structures to the latest support for .NET Core, it is time to update to the latest libraries! We have also released significant updates to our Big Data connectors for Spark and Kafka.

Data Structures

By bringing Native Collection bindings to the Couchbase SDK, it is now even easier to map your document data into structures your language understands. All the languages support it through simple functions, and .NET and Java have extra special support using their Collections Frameworks. Structures include List, Map, Set, and Queue – each with specific functions for add/remove, push/pop and more.

They are built to be as efficient as possible as well. Behind the scenes it uses our network-friendly sub-document processes, keeping traffic to a minimum while making atomic updates to documents on the server – all while you simply update the collections in your code.

No extra upserts or pulling down the whole document just to modify part of an array.  This is a great way to reduce the amount of document handling you need to do in your application.
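As a rough C# sketch of what this looks like in practice (a sketch only; the bucket variable and document key are assumptions for illustration, and the collection classes shown are the .NET ones described later in this digest):

// Sketch: appending to a list-backed document with the .NET data structures
// (CouchbaseList<T> lives in the Couchbase.Collections namespace).
// The "bucket" variable and the document key are illustrative assumptions.
var recentViews = new CouchbaseList<string>(bucket, "user::123::recent-views");

// Each Add() is sent to the server as an atomic sub-document mutation;
// no Get/Upsert of the whole document is needed.
recentViews.Add("product::987");
recentViews.Add("product::654");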

.NET Core Integration

Microsoft’s push to cross-platform development via .NET Core is extremely important for our community. So we wanted to make sure you could get .NET Core support for Couchbase as soon as possible. All .NET applications will benefit from moving to this latest platform – especially those wanting cross-operating-system support straight out-of-the-box.

For example, write apps on Windows and deploy on OS X and Linux without having to change your code.

As usual we push all our .NET libraries to NuGet to make it as simple as possible to integrate Couchbase into your application.

There are way more improvements in the latest .NET SDK release – read about them in the release notes.

Kafka 3.x Updates

Couchbase integration with Kafka has taken a major leap forward.  The 3.x updates bring support for both Sink and Source connector options, allowing you to read from and write to Couchbase using Kafka.  You can also easily process Couchbase events using Kafka Streams technology.

To help simplify development and deployment there is now Kafka Connect support – plug and play without having to write custom connectors between your Buckets and Topics.  This is especially easy via integration with Confluent Control Center – providing many powerful features, including real time monitoring, through a web UI.

Other features worth checking out include Dynamic Topology for rebalance and failover and much more.

Spark 2.x Updates

As with Kafka, our Spark connector has had many significant improvements recently.  The latest improvements include support for Spark 2.0 and related features.  We have even implemented some of the latest leading edge improvements, including Structured Streaming (both source and sink!).  Dynamic Topology is now supported to help with failover and rebalance needs in an easy manner.

Other Language Updates

There are many other updates across the Couchbase SDK this month – check out the latest changes in each of them below.  Now is the time to upgrade!

Release notes: .NET – Java – Node.js – Go – PHP – Python – C

You can keep informed of these releases by following the projects on GitHub, but a better way is to sign up for our Community Newsletter – keep informed of new releases, blogs, and community training events that show off the latest new features.

The post SDK Features – New For Couchbase 4.6 appeared first on The Couchbase Blog.

Categories: Architecture, Database

Data Structures: Native Collections New in Couchbase 4.6

NorthScale Blog - Fri, 02/17/2017 - 06:40

Data Structures in Couchbase 4.6 is our newest time-saving SDK feature.  These allow your client applications to easily map your array-based JSON data into language specific structures.

Leveraging native collections support in Couchbase will save you time and hassle:

  • Easily map JSON arrays into language specific structures
  • Couchbase Server manages the document efficiently – automatically using sub-document calls
  • You choose the data structure type you need and start coding

Support for Data Structures is available for all our languages: Java, .NET, Node.js, Go, PHP, Python, and C.  This includes powerful Java and .NET implementations via their Collections Frameworks, and all the other languages have a wide range of functional options.

This post shows how to get started using Data Structures, with specific examples in Java (using the Map type) and Python (using List and Queue types).  Video and reference links follow below.

Couchbase Data Structure Types

Four specific kinds of structures have been added to Couchbase client libraries: Map, List, Set, and Queue.  They are all variants of JSON arrays in the database but presented as native types to your client application.

  • List – an array that stores values in order
  • Map – also known as a dictionary – stores values by key
  • Set – a variant of a list that only retains unique values
  • Queue – a variant of a list that offers push and pop operations to add/remove items from the queue in a first-in-first-out (FIFO) manner
Java Collections Examples – Map & List

The Java and .NET APIs have the tightest native Collections interfaces.  This short example edits a user profile document as a Map and adds or updates the email contact information.

As the Map gets updated, so does the Document in the background – no manual saving or upserting is required!

See many more beautiful Couchbase .NET Data Structures examples in Matthew Groves’ blog post.

Map<String, String> userInfo = new CouchbaseMap<String>("user:mnunberg", bucket); 
userInfo.put("email", "mark.nunberg@couchbase.com");

Similarly the List is accessible through the CouchbaseArrayList and easily appended to.

List<String> ll = new CouchbaseArrayList<String>("user:mnunberg_list", bucket); 
ll.add("msg1"); 
ll.add("msg2");

Python Data Structures Examples – Queue

Here is a simple message Queue in Python, including a dictionary of timestamp, sender and some content.  Populate the queue using push to put new messages into it and then use pop to retrieve the first or next entry in the queue, while also removing it from the queue.

All this is done automatically behind the scenes when you use these functions.  No additional calls to the server are required to save the changed Queue.

>>> cb.queue_push("messages::tyler", {'timestamp': 1485389293, 'from':'user::mark', 'content':'Dear Tyler'}, create=True) 
>>> cb.queue_push("messages::tyler", {'timestamp': 1486390293, 'from':'user::jody', 'content':'Dear John...'}) 
>>> cb.queue_pop("messages::tyler").value 

{u'content': u'Dear Tyler', u'timestamp': 1485389293, u'from': u'user::mark'}

Python Data Structures Examples – List

The following example shows a simplified Python example using the List type.  In each case a new document is also created at the same time that it is populated with values.  See the Couchbase Python documentation for examples of the other types.

In an IoT use case you may have sensors recording specific timestamped activities and related data values.  Here, a sensor has its own document and a vehicle ID and timestamp are recorded when detected by the sensor.

>>> cb.list_append("garage1", ['vehicle::1A', '2017-01-24 08:02:00'], create=True) 
>>> cb.list_append("garage1", ['vehicle::2A', '2017-01-24 10:21:00']) 
>>> cb.list_append("garage1", ['vehicle::1A', '2017-01-25 17:16:00'])

The resulting document is an array with each entry holding two values in an array.

[ [ "vehicle::1A", "2017-01-24 08:02:00" ],
  [ "vehicle::2A", "2017-01-24 10:21:00" ],
  [ "vehicle::1A", "2017-01-25 17:16:00" ] ]

Retrieving the values into a Python list is done easily. Just grab the document and it’s instantly available to iterate over.

>>> garage1 = cb.get('garage1') 
>>> for rec in garage1.value: print rec 

[u'vehicle::1A', u'2017-01-24 08:02:00'] 
[u'vehicle::2A', u'2017-01-24 10:21:00'] 
[u'vehicle::1A', u'2017-01-25 17:16:00']

Next Step

As you can see, the syntax is easy and predictable. By offloading management of these structures to Couchbase Server it simplifies a lot of the communications required to manage dynamic documents. In no time you can be using Couchbase 4.6 as a Data Structure server for your applications.

 

The post Data Structures: Native Collections New in Couchbase 4.6 appeared first on The Couchbase Blog.

Categories: Architecture, Database

Introducing Couchbase .NET 2.4.0 – .NET Core GA

NorthScale Blog - Fri, 02/17/2017 - 01:38

This release is the official GA release for .NET Core support in the Couchbase .NET SDK! .NET Core is the latest incarnation of the .NET framework, and it’s described as “.NET Core is a blazing fast, lightweight and modular platform for creating web applications and services that run on Windows, Linux and Mac”.

Wait a minute…read that again: “.NET Core is a blazing fast, lightweight and modular platform for creating web applications and services that run on Windows, Linux and Mac.” Microsoft .NET applications running on OSX and Linux? What kind of bizarro world are we living in? It’s the “New” Microsoft for sure!

In this blog post, I’ll go over what is in the 2.4.0 release, changes to packaging (NuGet), and what version of .NET the SDK supports. We’ll also demonstrate some of the new features such as Datastructures.

What’s in this release?

2.4.0 is a large release with over 30 commits. When you consider that we released 3 Developer Previews leading up to 2.4.0, there are actually many, many more commits leading up to this release over the last 6 months. Here is an overview of some of the more impressive features – you can see all of the commits in the “Release Notes” section below:

.NET Core Support

Of course the most significant feature of 2.4.0 is .NET Core support, which from the opening paragraph, means you can now develop on Mac OS or Windows and deploy to Linux (or vice-versa, but the tooling is a bit immature still). This is great stuff and a major change for the traditional Windows developer.

If you’re unaware of .NET Core, you can read up more about it over on the .NET Core website. One cool thing about it is that it’s open source (Apache 2.0) and source is all available on Github.

The Couchbase SDK specifically supports netstandard1.5 or greater. We tested the SDK using 1.0.0-preview2-1-003177 of the Command Line Tools.

Packaging changes

Just like the three developer previews, the NuGet package will contain binaries both for the .NET Full Framework (targeting .NET 4.5 or greater) and for .NET Core (targeting .NET Core 1.1). Depending on the target project you are including the dependency for, the correct binaries will be used.

So, if your Visual Studio project is a .NET Full Framework application greater than or equal to 4.5, you’ll get the binaries for the full framework version of .NET. Likewise, if your application is a .NET Core application, then the .NET Core version of the binaries will be used. There should be nothing you have to do to enable this.

The older .NET 4.5 version of the packages will no longer be released; 2.3.11 is the last supported release of the 2.3.X series.

MS Logging for Core

For .NET Core we decided to change from using Common.Logging to MS Logging, mainly because no 3rd party logging framework (log4net, for example) has stable support for .NET Core at this time.

Additionally, by moving from Common.Logging to MS Logging we have removed one more 3rd party dependency – which is always nice. Not that Common.Logging wasn’t sufficient, but it makes more sense to use a dependency from Microsoft.

Here is an example of configuring the 2.4.0 client targeting .NET Core and using NLog:

First add the dependencies to the project.json:

{
  "version": "1.0.0-*",
  "buildOptions": {
    "emitEntryPoint": true,
    "copyToOutput": {
      "include": [ "config.json", "nlog.config" ]
    }
  },

  "dependencies": {
    "CouchbaseNetClient": "2.4.0-dp6",
    "NLog.Extensions.Logging": "1.0.0-rtm-beta1",
    "Microsoft.NETCore.App": {
      "type": "platform",
      "version": "1.0.1"
    },
    "Microsoft.Extensions.Logging.Debug": "1.1.0",
    "Microsoft.Extensions.Logging": "1.1.0"
  },

  "frameworks": {
    "netcoreapp1.0": {
      "imports": "dnxcore50"
    }
  }
}

Then, add a nlog.config file to your project with the following contents:

<?xml version="1.0" encoding="utf-8" ?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      autoReload="true"
      internalLogLevel="Debug"
      internalLogFile="c:\temp\internal-nlog.txt">

  <!-- define various log targets -->
  <targets>
    <!-- write logs to file -->
    <target xsi:type="File" name="allfile" fileName="c:\temp\nlog-all-${shortdate}.log"
                layout="${longdate}|${event-properties:item=EventId.Id}|${logger}|${uppercase:${level}}|${message} ${exception}" />

    <target xsi:type="Null" name="blackhole" />
  </targets>

  <rules>
    <!--All logs, including from Microsoft-->
    <logger name="*" minlevel="Trace" writeTo="allfile" />
  </rules>
</nlog>

Finally, add the code to configure the Couchbase SDK for logging:

using Couchbase;
using Couchbase.Logging;
using Microsoft.Extensions.Logging;
using NLog.Extensions.Logging;

namespace ConsoleApp2
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var factory = new LoggerFactory();
            factory.AddDebug();
            factory.AddNLog();
            factory.ConfigureNLog("nlog.config");

            //configure logging on the couchbase client
            var config = new ClientConfiguration
            {
                LoggerFactory = factory
            };

            var cluster = new Cluster(config);
            //use the couchbase client
        }
    }
}

Note that the project.json has a copyToOutput.include value for nlog.config. This is required so the tooling will copy that file to the output directory when built.

Now for the .NET 4.5 Full Framework binaries, the dependency on Common.Logging remains and any existing logging configuration should work as it always has.

Datastructures

Datastructures are a new way of working with Couchbase documents as if they were common Computer Science data structures such as lists, queues, dictionaries, or sets. There are two implementations in the SDK: one as a series of methods on CouchbaseBucket which provide functionality for common data structure operations, and another as implementations of the interfaces within System.Collections.Generic. Here is a description of each Datastructure class found in the SDK:

  • CouchbaseDictionary<TKey, TValue>: Represents a collection of keys and values stored within a Couchbase Document.
  • CouchbaseList<T>: Represents a collection of objects, stored in Couchbase server, that can be individually accessed by index.
  • CouchbaseQueue<T>: Provides a persistent Couchbase data structure with FIFO behavior.
  • CouchbaseSet<T>: Provides a Couchbase persisted set, which is a collection of objects with no duplicates.

All of these classes are found in the Couchbase.Collections namespace. Here is an example of using a CouchbaseQueue<T>:

var queue = new CouchbaseQueue<Poco>(_bucket, "somekey");
queue.Enqueue(new Poco { Name = "pcoco1" });
queue.Enqueue(new Poco { Name = "pcoco2" });
queue.Enqueue(new Poco { Name = "pcoco3" });

var item = queue.Dequeue();
Assert.AreEqual("pcoco1", item.Name);
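The other collection classes follow the same pattern. Here is a rough sketch (the _bucket variable and the document keys are illustrative assumptions, as above):

// Sketch: a dictionary and a set, each persisted as a Couchbase document.
// The keys ("somedict", "someset") and _bucket are illustrative assumptions.
var scores = new CouchbaseDictionary<string, int>(_bucket, "somedict");
scores.Add("alice", 42);
scores["bob"] = 17;          // behaves like any IDictionary<string, int>

var tags = new CouchbaseSet<string>(_bucket, "someset");
tags.Add("nosql");
tags.Add("nosql");           // duplicates are ignored; a set keeps unique values only

Console.WriteLine(scores["alice"]);  // 42
Console.WriteLine(tags.Count);       // 1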

Multiplexing IO

The Couchbase SDK has used connection pooling in the past to allow high throughput and scale at the cost of latency and resource utilization. In Couchbase 2.2.4 we introduced a better IO model called Multiplexing IO, or MUX-IO, which the client could be configured to use (the default was pooled connections).

In 2.4.0 we are making MUX-IO the default IO model and making connection pooling optional. What this means to you is that some connection pooling properties in your configuration may still be used by the SDK. For example:

  • PoolConfiguration.MaxSize is still used but should be a relatively small value – e.g. 5-10
  • PoolConfiguration.MinSize should be 0 or 1

To disable MUX-IO it’s simply a matter of setting ClientConfiguration.UseConnectionPooling to true (the default is false) to use connection pooling:

var clientConfig = new ClientConfiguration
{
    UseConnectionPooling = true
};
var cluster = new Cluster(clientConfig);
 
//open buckets and use the client

Streaming N1QL and Views

Streaming N1QL and Views are a performance optimization in certain cases where the amount of data retrieved is large. To understand why, let’s consider how non-streaming queries work:

  1. A request is dispatched to the server.
  2. The server does its processing and returns the results as a stream after processing the entire response.
  3. The client buffers the entire stream and then de-serializes the stream into a collection of type “T”, where T is the POCO that each result is mapped to.
  4. The client then returns the list to the application within its IResult.

What can go wrong here? Think about very large results and that memory resources are finite: eventually you will always encounter an OutOfMemoryException! There are other side effects as well related to Garbage Collection.

With streaming clients the process is as follows:

  1. A request is dispatched to the server.
  2. The server does its processing and returns the results as a stream as soon as the response headers are available.
  3. The client partially reads the headers and meta-data and then pauses until iteration occurs.
  4. When the application starts iterating over the IResult, each item is read one at a time without being stored in an underlying collection.

The big benefit here is that the working set of memory will not grow as the collection grows and is internally re-sized by .NET. Instead, you have a fixed working set of memory, and GC can occur as soon as the read object is discarded.

To use streaming N1QL and views, all that you do is call the UseStreaming() method and pass in true to stream:

var request = new QueryRequest("SELECT * FROM `travel-sample` LIMIT 100;").UseStreaming(true);
using (var result = _bucket.Query<dynamic>(request))
{
    Console.WriteLine(result);
}

Passing in false will mean that the entire response is buffered and processed before returning.

N1QL Query Cancellation

This feature allows long running N1QL queries to be canceled before they complete using task cancellation tokens. For example:

var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromMilliseconds(5));

var result = await _bucket.QueryAsync<dynamic>(queryRequest, cancellationTokenSource.Token);
//do something with the result

This commit was via a community contribution from Brant Burnett of CenteredgeSoftware.com!

Important TLS/SSL Note on Linux

There is one issue on Linux that you may come across if you are using SSL: a PlatformNotSupportedException will be thrown if you have a version of libcurl installed on the server < 7.30.0. The work-around is to simply upgrade your libcurl installation on Linux to something equal to or greater than 7.30.0. You can read more about this on the Jira ticket: NCBC-1296.

 

The post Introducing Couchbase .NET 2.4.0 – .NET Core GA appeared first on The Couchbase Blog.

Categories: Architecture, Database

Couchbase Server 4.6 and macOS Sierra

NorthScale Blog - Thu, 02/16/2017 - 18:59

I am pleased to announce that the latest version of Couchbase Server (4.6) is now compatible with macOS Sierra! From the Couchbase downloads page, choose either Couchbase Server 4.6 Enterprise Edition or Couchbase Server 4.6 Community Edition depending on your needs.

macOS Sierra

If you’re unfamiliar with some of the things that Couchbase can do, check out an article I wrote recently on some of the Couchbase Server basics called, Couchbase and the Document-Oriented NoSQL Database. For help using Couchbase, check out the Developer Portal.

The post Couchbase Server 4.6 and macOS Sierra appeared first on The Couchbase Blog.

Categories: Architecture, Database

Couchbase Server 4.6 Supports Windows 10 Anniversary Update

NorthScale Blog - Thu, 02/16/2017 - 15:27

Back in August 2016, when the Windows 10 Anniversary Update was rolling out, I blogged that Couchbase Server was not working correctly on it. That is no longer true!

Short version: Couchbase Server 4.6 now supports Windows 10 Anniversary Update. Go download and try it out today.

The longer story is that this issue was addressed in the 4.5.1 release. The fix was somewhat experimental, and the anniversary update was still in the process of being rolled out. So there were two releases of Couchbase Server 4.5.1 for Windows:

  • Normal windows release (works with Windows 10, Windows Server, etc but not Anniversary Update)
  • Windows 10 Anniversary Edition Developer Preview (DP) release

Furthermore, Couchbase Server 4.6 has had a Developer Preview release of its own for a while, and that release also works with the anniversary update.

But now everything is official.

  • Couchbase Server 4.6 has been released
  • Couchbase Server 4.6 officially supports Windows 10 Anniversary Update

Go download Couchbase Server 4.6 now.

Got questions? Got comments? Check out our documentation on the Couchbase Developer Portal, post a question on the Couchbase Forums, leave a comment here, or ping me on Twitter.

The post Couchbase Server 4.6 Supports Windows 10 Anniversary Update appeared first on The Couchbase Blog.

Categories: Architecture, Database

Announcing Couchbase Server 4.6 – What’s New and Improved

NorthScale Blog - Thu, 02/16/2017 - 15:00

Couchbase delivers the Couchbase Data Platform that powers Web, Mobile, and IoT applications for digital businesses. With our newest release, Couchbase Server 4.6 provides the availability, scalability, performance, and security that enterprises require for their mission-critical applications.

What’s New and Improved

Query

The new string, date, array, and JSON object functions that have been added to N1QL simplify data transformations and provide richer query expressions. Faster queries in N1QL are the result of several query engine performance enhancements across many types of operations, including joins and index scans.

Check out documentation for String functions,  Date functions, Array functions and Object functions.
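As a quick, hedged illustration using the .NET SDK query API that appears elsewhere in this digest (the bucket variable is an assumption; NOW_STR, DATE_PART_STR, and UPPER are standard N1QL functions):

// Sketch: exercising N1QL date and string functions from the .NET SDK.
// The cluster/bucket setup is assumed and not shown here.
var request = new QueryRequest(
    "SELECT NOW_STR() AS now, DATE_PART_STR(NOW_STR(), 'year') AS year, UPPER('couchbase') AS shout;");
var result = bucket.Query<dynamic>(request);
foreach (var row in result.Rows)
{
    Console.WriteLine(row);  // e.g. { "now": "2017-02-16T...", "year": 2017, "shout": "COUCHBASE" }
}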

Replication

Cross datacenter replication (XDCR) with timestamp-based conflict resolution makes it easier for applications to implement a Last Write Wins (LWW) document conflict management policy across multiple Couchbase clusters. The per-document timestamp combines the server logical and physical clocks together, forming a hybrid logical clock and timestamp, which enables easy identification of consistent document snapshots across distributed Couchbase clusters.

Check out documentation for Timestamp-based conflict resolution.

Security

Adding support for Pluggable Authentication Modules (PAM) simplifies centralized password and policy management across servers. It also enables use of existing password management services for a Couchbase cluster (for example, Linux /etc/shadow). The new server secret management feature provides improved enterprise security compliance and a more security-hardened Couchbase Server.

Check out documentation for Pluggable Authentication Modules and Secret management.

Tools [Developer Preview]

It is now easier than ever to move data in and out of Couchbase Server using the new flexible import and export tools. cbimport imports data from a CSV file or a JSON document. cbexport exports data as a JSON document.

Check out documentation for Cbimport and Cbexport.

Data Access

Adding direct support for lists, maps, sets, and queues in the sub-document API using the new data structure SDK feature further simplifies application development. The new data structures work seamlessly with the same underlying data representation, allowing developers in N1QL, Java, .NET, and other languages to access the same data across different programming languages and interfaces. Adding .NET Core support enables Microsoft application developers to easily develop and integrate their applications with Couchbase Server.

Check out documentation for Data structures and the .NET Core blog.

Search [Developer Preview 2]

Search adds support for MossStore, the new default kv store mechanism for full text indexes in FTS. MossStore is part of Moss (“Memory-oriented sorted segments”), a simple, fast, persistable, ordered key-value collection implemented as a pure Golang library. You can now create custom index mappings that use the document key to determine the type; with this enhancement, it’s easier to support the common data modeling style in which the document type is indicated by a portion of the key. This release also lets you sort search results by any field in the document, as long as that field is also indexed. In earlier releases, search results were always returned in order of descending relevance score.

Check out documentation for Index Type Mapping By Keys and Sorting Query Results.

Here are some resources to get you started –

The post Announcing Couchbase Server 4.6 – What’s New and Improved appeared first on The Couchbase Blog.

Categories: Architecture, Database

Stateful Containers using Portworx and Couchbase

NorthScale Blog - Thu, 02/16/2017 - 06:55


Containers are meant to be ephemeral and so scale pretty well for stateless applications. Stateful containers, such as Couchbase, need to be treated differently. Managing Persistence for Docker Containers provides a great overview of how to manage persistence for stateful containers.

This blog will explain how to use Docker Volume Plugins and Portworx to create a stateful container.

Why Portworx?

Portworx is an easy-to-deploy container data service that provides persistence, replication, snapshots, encryption, secure RBAC, and much more. Some of the benefits are:

  1. Container granular volumes – Portworx can take multiple EBS volumes per host and aggregate the capacity and derive container granular virtual (soft) volumes per container.
  2. Cross Availability Zone HA – Portworx will protect the data, at block level, across multiple compute instances across availability zones. As replication controllers restart pods on different nodes, the data will still be highly available on those nodes.
  3. Support for enterprise data operations – Portworx implements container granular snapshots, class of service, tiering on top of the available physical volumes.
  4. Ease of deployment and provisioning – Portworx itself is deployed as a container and integrated with the orchestration tools. DevOps can programmatically provision container granular storage with any property such as size, class of service, encryption key etc.
Setup AWS EC2 Instance

Portworx runs only on Linux or CoreOS. Set up an Ubuntu instance on AWS EC2:

  1. Start Ubuntu 14.04 instance with m3.medium instance type. Make sure to add port 8091 to inbound security rules. This allows Couchbase Web Console to be accessible afterwards.
  2. Login to the EC2 instance using the command: ssh -i ~/.ssh/arun-cb-west1.pem ubuntu@<public-ip>
  3. Update the Ubuntu instance: sudo apt-get update
  4. Install Docker: curl -sSL https://get.docker.com/ | sh. More detailed instructions are available at Get Docker for Ubuntu.
  5. Enable non-root access for the docker command: sudo usermod -aG docker ubuntu
  6. Logout from the EC2 instance and log back in
Create AWS EBS Volume
  1. Create an EBS volume for 10GB using EC2 console as explained in docs.
  2. Get the instance id from the EC2 console. Attach this volume to EC2 instance using this instance id, use the default device name /dev/sdf.
    Portworx EC2 Create Volume
  3. Use lsblk command in EC2 instance to verify that the volume is attached to the instance:
    NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    xvda    202:0    0   8G  0 disk
    └─xvda1 202:1    0   8G  0 part /
    xvdb    202:16   0  30G  0 disk /mnt
    xvdf    202:80   0  10G  0 disk
Portworx Container
  1. The physical storage makeup of each node, all the provisioned volumes in the cluster, as well as their container mappings, are stored in an etcd cluster. Start an etcd cluster:
    docker run -v \
      /data/varlib/etcd \
      -p 4001:4001 \
      -d \
      portworx/etcd:latest
  2. By default root mounted volumes are not allowed to be shared. Enable this using the command:
    sudo mount --make-shared /

    This is explained more at Ubuntu Configuration and Shared Mounts.
  3. Running the PX-Developer (px-dev) container on a server with Docker Engine turns that server into a scale-out storage node. PX-Enterprise, on the other hand, provides multi-cluster and multi-cloud support, where storage under management can be on-premise or in a public cloud like AWS.
    For this blog, we’ll start a px-dev container:
    docker run --restart=always --name px -d --net=host \
      --privileged=true                             \
      -v /run/docker/plugins:/run/docker/plugins    \
      -v /var/lib/osd:/var/lib/osd:shared           \
      -v /dev:/dev                                  \
      -v /etc/pwx:/etc/pwx                          \
      -v /opt/pwx/bin:/export_bin:shared            \
      -v /var/run/docker.sock:/var/run/docker.sock  \
      -v /var/cores:/var/cores                      \
      -v /usr/src:/usr/src                           \
      --ipc=host                                    \
      portworx/px-dev -daemon -k etcd://localhost:4001 -c cluster1 -s /dev/xvdf

    Complete details about this command are available at Run PX with Docker.
  4. Look for logs using docker container logs -f px and watch out for the following statements:
    time="2017-02-16T05:33:26Z" level=info msg="Initialize the scheduler client and the scheduler watch" 
    time="2017-02-16T05:33:26Z" level=info msg="Started a kvdb watch on key : scheduler/containers" 
    time="2017-02-16T05:33:26Z" level=info msg="Started a kvdb watch on key : scheduler/volumes" 
    time="2017-02-16T05:33:26Z" level=info msg="Started a kvdb watch on key : scheduler/nodes/list"
  5. Check the status of attached volumes that are available to Portworx using sudo /opt/pwx/bin/pxctl status to see the output:
    Status: PX is operational
    Node ID: 679b79b1-f4c3-413e-a8e0-c527348647c9
    	IP: 172.31.25.21 
     	Local Storage Pool: 1 pool
    	Pool	IO_Priority	Size	Used	Status	Zone	Region
    	0	LOW		10 GiB	266 MiB	Online	a	us-west-1
    	Local Storage Devices: 1 device
    	Device	Path		Media Type		Size		Last-Scan
    	0:1	/dev/xvdf	STORAGE_MEDIUM_SSD	10 GiB		16 Feb 17 05:33 UTC
    	total			-			10 GiB
    Cluster Summary
    	Cluster ID: cluster1
    	Node IP: 172.31.25.21 - Capacity: 266 MiB/10 GiB Online (This node)
    Global Storage Pool
    	Total Used    	:  266 MiB
    	Total Capacity	:  10 GiB

    It shows the total capacity available and used.
Docker Volume
  1. Let’s create a Docker volume:
    docker volume create -d pxd -o size=10G -o fs=ext4 --name cbvol

    More details about this command are at Create Volumes with Docker.
  2. Check the list of volumes available using docker volume ls command:
    DRIVER              VOLUME NAME
    local               70f7b9a356df4c1f0c08e13a4e813f1ef3e174a91001f277a63b62d683a27159
    pxd                 cbvol
    local               f7bc5fa455a88638c106881f1bce98244b670e094d5fdc47917b53a88e46c073

    As shown, cbvol is created with pxd driver.
Couchbase with Portworx Volume
  1. Create a Couchbase container using the Portworx volume:
    docker container run \
      -d \
      --name db \
      -v cbvol:/opt/couchbase/var \
      -p 8091-8094:8091-8094 \
      -p 11210:11210 \
      arungupta/couchbase

    Notice how /opt/couchbase/var, where all Couchbase data is stored in the container, is mapped to the cbvol volume on the host. This volume is mapped by Portworx.
  2. Log in to the Couchbase Web Console at http://<public-ip>:8091, using Administrator as the login and password as the password.
  3. Go to Data Buckets and create a new data bucket pwx:
    Couchbase Bucket with Portworx
  4. In EC2 instance, see the list of containers:
    ubuntu@ip-172-31-25-21:~$ docker container ls
    CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS                                                                                               NAMES
    8ae763d9d53b        arungupta/couchbase    "/entrypoint.sh /o..."   5 minutes ago       Up 5 minutes        0.0.0.0:8091-8094->8091-8094/tcp, 11207/tcp, 11211/tcp, 0.0.0.0:11210->11210/tcp, 18091-18093/tcp   db
    5423bcd9b426        portworx/px-dev        "/docker-entry-poi..."   14 minutes ago      Up 14 minutes                                                                                                           px
    cf3c779a4459        portworx/etcd:latest   "/entrypoint.sh /b..."   21 minutes ago      Up 21 minutes       2379-2380/tcp, 7001/tcp, 0.0.0.0:4001->4001/tcp                                                     youthful_jepsen

    etcd, px-dev and db containers are running.
  5. Kill the db container:
    docker container rm -f db
  6. Restart the database container as:
    docker container run \
      -d \
      --name db \
      -v cbvol:/opt/couchbase/var \
      -p 8091-8094:8091-8094 \
      -p 11210:11210 \
      arungupta/couchbase

    Now, because cbvol is mapped to /opt/couchbase/var again, the data is preserved across restarts. This can be verified by accessing the Couchbase Web Console and checking on the pwx bucket created earlier.

Another interesting perspective is “why databases are not for containers?”. Just because there is Docker does not mean all your database needs should be Dockerized. But if you need to, then there are plenty of options that can be used in production-grade applications.

Want to learn more about running Couchbase in containers?

The post Stateful Containers using Portworx and Couchbase appeared first on The Couchbase Blog.

Categories: Architecture, Database

Couchbase DCP Rollback and How QA Tests Them

NorthScale Blog - Wed, 02/15/2017 - 20:41
Introduction

 

In this post I will describe how the Couchbase QA team goes to great lengths to test Couchbase products, using DCP Rollbacks as an example, first giving background on what DCP is and what a DCP rollback is.

 

What is DCP?

 

DCP stands for Database Change Protocol. It is used between Couchbase nodes, such as replicas or views, or to external clients such as XDCR or FTS. It follows a producer/consumer model and provides high throughput and low latency. Data is streamed per vbucket. It is described here and there is a deep dive video here.

 

Of particular interest is the sequence number. The sequence number is a monotonically increasing number associated with mutations on a vbucket and is used to keep the producer and consumer in sync.

 

What is a rollback?

 

No, not these rollbacks

A Couchbase rollback occurs when a client connects to a producer with a sequence number greater than what the producer has. To put it another way, the client has newer mutations, but as these are not the “truth” it must roll back, or undo, some mutations to align with the new truth.

 

This is not usual, but it does happen, and for the purposes of data integrity it is important that it is handled correctly.

 

In a live environment this could happen in the following scenarios:

 

Failover with mutations in flight (this can occur with a hard failover)
  1. Client is connected to active node A
  2. Mutations with sequence number 100 arrives at client and is in flight to replica node B
  3. Before mutation with sequence number 100 arrives at node B there is a failover, node B has only sequence number 90
  4. The client, recognizing the failover connects to the newly active node B with sequence number 100
  5. Node B only having sequence number 90, responds to the client asking for a rollback to sequence 90
  6. The client undoes the effects of the mutations for sequence numbers 91-100, and requests a connection to the producer with sequence number 90.

 

Crash with non-persisted data

 

  1. Client is connected to active node A
  2. Mutations up to sequence number 90 have been persisted
  3. Mutations up to sequence number 100 arrives at client and is not yet persisted
  4. Memcached crashes and restarts, no failover
  5. Client reconnects to node A with sequence number 100
  6. Similar to steps 5 and 6 above

 

Testing for and with DCP Rollbacks (and how they break)

 

The Couchbase QA organization tests rollbacks in three ways:

 

Producer

The producer requests rollbacks in cases where the client requests a sequence number greater than what is known by the producer. We verify that the producer really does request the client to roll back and that it returns consistent data from that point.

We have developed a custom DCP client. It performs mutations and, in parallel, creates DCP connections and exercises specific scenarios. It can manipulate the requested sequence number, trigger other feature interactions such as compaction, and monitor the persistence state.

 

Consumer

When receiving a rollback request, we verify that the client properly undoes later mutations and correctly applies the new mutations.

 

The following scenario will deterministically cause a client to get a rollback request:

 

  1. Stop persistence
  2. Kill memcached and memcached restarts
  3. DCP connections will be rolled back to the sequence number prior to persistence being stopped

 

Test writers use the above scenario to trigger the rollback and verify that the client properly undoes the rolled back mutations.

Typical issues found on the consumer side deal with the consumer retaining information associated with the rolled-back mutations.

 

System test

 

Do hard failovers at high mutation rates (which cause active vbuckets to get ahead of the replicas) while views and 2i are in use. As a result, there will be data loss while promoting replica vbuckets to active and rolling back sequence numbers, but the cluster should continue to be stable. Typical bugs found include the newly promoted replica having inconsistent data, and clients not handling rollback correctly.

 

The post Couchbase DCP Rollback and How QA Tests Them appeared first on The Couchbase Blog.

Categories: Architecture, Database

Using Couchbase JDBC with Tibco BusinessWorks

NorthScale Blog - Tue, 02/14/2017 - 18:14
Couchbase JDBC and Tibco ActiveMatrix Business Works 6.3

Summary

Establish rapid application workflows with Tibco ActiveMatrix Business Works by using Couchbase Server third-party drivers provided by Simba Technologies and CData.

    Table of Contents
  • Third Party Tools
  • Query
  • Data Integration
  • Tibco Business Works
  • Setup a Custom Driver

In recent years NoSQL has had a steep increase in use within many areas of industry, most notably within the enterprise. Many enterprise customers develop tools or use tools to develop applications, and they face build-or-buy decisions throughout the lifecycle of an application. Because NoSQL is very developer friendly it is often at the forefront of data-centric application development. This is especially true in the digital economy. NoSQL databases and document stores are used for development and often require integration with existing tools.

This creates a challenge for developers, managers, and executives when planning to adopt NoSQL systems, primarily due to the extra overhead of the learning curve for end-users who use tools to introspect data that is now in a new format. Couchbase has a very powerful set of standards-based features which allows the re-use of existing knowledge, thereby reducing the learning curve and ultimately the barrier to adoption.

Third Party Tools

Couchbase has several partners and there are two in particular who provide data integration technologies for existing applications in the form of ODBC and JDBC drivers. This allows off-the-shelf software and existing applications to connect to popular NoSQL systems like Couchbase.

Query

Accessing data in any database often requires some level of querying, along with a common, hopefully native, query language to execute those queries. NoSQL systems traditionally have been used for what is known as a key/value access pattern, and possibly for indexed queries through the use of map-reduce.

Later advents, such as search technology, allowed new queries against NoSQL systems, often through a proprietary, SQL-like language. Eventually tools were employed, like Hadoop and Spark integrations, to scale MapReduce queries. These access methods have challenges though: lack of ad-hoc query, poor performance, no automatic joins, and missing features which require additional application code. These are all reasons why the need for a query language exists, even when proprietary, for NoSQL data.

One key feature of Couchbase Server is a complete ad-hoc query language called N1QL, pronounced “nickel”. This language is a standards-oriented, purpose-built query language based on the ANSI SQL-92 standard. Because N1QL is standards based, it allows customers to use the Couchbase Server Query service and built-in query workbench to perform ad-hoc queries with complex logic, joins, subqueries, and much more against the rich JSON document data stored within Couchbase.

Managing data may be a task … but it doesn’t have to be a difficult one.

SQL: Developed in 1970s to deal with first wave of data storage applications

NoSQL: Developed in 2000s to deal with limitations of Relational Databases, particularly concerning scale, replication, developer agility, and unstructured data storage

Data Integration

To handle the challenges NoSQL might present to developers Couchbase has many partners. One set of partners, in particular, enables end-users to easily access NoSQL data with tools such as Excel, Tibco, Informatica, Tableau, or anything that can use ODBC or JDBC.

Simba Technologies and CData.com provide ODBC and JDBC drivers for Couchbase Server query service to allow applications to use standard SQL queries with Couchbase server. Using these drivers you can enable popular applications like Tibco BusinessWorks (BW) to use data stored within Couchbase natively.

Tibco Business Works

This document was written using Tibco BusinessWorks 6.3 on Mac OS X El Capitan but the configuration described will work on Windows systems as well.

The steps used to configure the environment are:

  1. Install Tibco ActiveMatrix BW 6.3 with the java installer (Windows, Linux or Mac)
  2. Complete any necessary BW 6.3 configuration steps
  3. Download a Couchbase Server JDBC driver
  4. Install the JDBC driver with administrative privileges
  5. Record the installation directory for later use
  6. Configure a Tibco Data source as Custom/JDBC

The following section will discuss how to set up a “Custom JDBC” connection in Tibco BusinessWorks with the Couchbase JDBC drivers.

Setup a Custom Driver

Start Tibco BW 6.3, then right-click under the project explorer, choose “New->Application Module”, and step through the interfaces until a default package is listed, like below:

To add a custom JDBC connection, right-click the Resources object and choose “JDBC Connection” to open the JDBC configuration screen, and change the Resource Name as desired.

On the following screen you will need to configure the following options:

Once configured the explorer menu should look like to this:

As shown in the screenshot above, there is a process resource listed as “process.bwp”, and this is used to define the application workflow. To test queries, double-click on the process object and then drag-and-drop a JDBC->JDBCQuery object from the menu on the right:

Click on the JDBCQuery object and enter a query to test. For example

select count(*),type from `beer-sample` group by type;

This will yield a result similar to the following table.

Total   type
1412    brewery
5891    beer

Enter the query in the “Statement” section of the dialog, as shown in the diagram below:

Upon executing the query, the results appear in the “SQL Results” tab:

You can test the same way with the other Couchbase JDBC drivers as well. For more information please see the partner web sites for additional documentation about the capabilities of each driver and compatibility with your application.

 

The post Using Couchbase JDBC with Tibco BusinessWorks appeared first on The Couchbase Blog.

Categories: Architecture, Database

Moving from SQL Server to Couchbase Part 2: Data Migration

NorthScale Blog - Tue, 02/14/2017 - 16:50

In this series of blog posts, I’m going to lay out the considerations when moving to a document database when you have a relational background. Specifically, Microsoft SQL Server as compared to Couchbase Server.

In three parts, I’m going to cover:

  • Data modeling
  • The data itself (this blog post)
  • Applications using the data

The goal is to lay down some general guidelines that you can apply to your application planning and design.

If you would like to follow along, I’ve created an application that demonstrates Couchbase and SQL Server side-by-side. Get the source code from GitHub, and make sure to download a developer preview of Couchbase Server.

Data Types in JSON vs SQL

Couchbase (and many other document databases) use JSON objects for data. JSON is a powerful, human readable format to store data. When comparing to data types in relational tables, there are some similarities, and there are some important differences.

All JSON data is made up of 6 types: string, number, boolean, array, object, and null. There are a lot of data types available in SQL Server. Let‚Äôs start with a table that is a kind of “literal” translation, and work from there.

SQL Server                      JSON
nvarchar, varchar, text         string
int, float, decimal, double     number
bit                             boolean
null                            null
XML/hierarchyid fields          array / object

It’s important to understand how JSON works. I’ve listed some high-level differences between JSON data types and SQL Server data types. Assuming you already understand SQL data types, you might want to spend some time learning more about JSON and JSON data types.

A string in SQL Server is often defined by a length. nvarchar(50) or nvarchar(MAX) for instance. In JSON, you don’t need to define a length. Just use a string.

A number in SQL Server varies widely based on what you are using it for. The number type in JSON is flexible, in that it can store integers, decimal, or floating point. In specialized circumstances, like if you need a specific precision or you need to store very large numbers, you may want to store a number as a string instead.

A boolean in JSON is true/false. In SQL Server, it’s roughly equivalent: a bit that represents true/false.

In JSON, any value can be null. In SQL Server, you set this on a field-by-field basis. If a field in SQL Server is not set to “nullable”, then it will be enforced. In a JSON document, there is no such enforcement.

JSON has no date data type. Often dates are stored as UNIX timestamps, but you could also use string representations or other formats for dates. The N1QL query language has a variety of date functions available, so if you want to use N1QL on dates, you can use those functions to plan your date storage accordingly.
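For example, here is a hedged C# sketch of those two options (the field names and document key are illustrative, not from the sample app):

// Sketch: two common ways to represent a date in a JSON document.
// Field names and the document key are illustrative assumptions.
var createdUtc = DateTime.UtcNow;
var epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

var order = new
{
    type = "Order",
    createdAtIso = createdUtc.ToString("o"),                         // ISO-8601 string, usable with N1QL string-based date functions
    createdAtMillis = (long)(createdUtc - epoch).TotalMilliseconds   // epoch milliseconds, usable with N1QL MILLIS-based date functions
};
bucket.Upsert("order::1001", order);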

In SQL Server, there is a geography data type. In Couchbase, the GeoJSON format is supported.

There are some other specialized data types in SQL Server, including hierarchyid, and xml. Typically, these would be unrolled in JSON objects and/or referenced by key (as explored in part 1 of this blog series on data modeling). You can still store XML/JSON within a string if you want, but if you do, then you can’t use the full power of N1QL on those fields.
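As a rough illustration of the "unrolled" approach (a sketch only; the document shape and key are assumptions, not the sample app's model):

// Sketch: a hierarchyid-style ancestry path unrolled into a plain JSON array.
var category = new
{
    type = "Category",
    name = "Laptops",
    path = new[] { "Electronics", "Computers", "Laptops" }   // ancestry as a JSON array, queryable with N1QL
};
bucket.Upsert("category::laptops", category);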

Migrating and translating data

Depending on your organization and your team, you may have to bring in people from multiple roles to ensure a successful migration. If you have a DBA, that DBA will have to know how to run and manage Couchbase just as well as SQL Server. If you are DevOps, or have a DevOps team, it’s important to involve them early on, so that they are aware of what you’re doing and can help you coordinate your efforts. Moving to a document database does not mean that you no longer need DBAs or Ops or DevOps to be involved. These roles should also be involved when doing data modeling, if possible, so that they can provide input and understand what is going on.

After you’ve designed your model with part 1 on data modeling, you can start moving data over to Couchbase.

For a naive migration (1 row to 1 document), you can write a very simple program to loop through the tables, columns, and values of a relational database and spit out corresponding documents. A tool like Dapper would handle all the data type translations within C# and feed them into the Couchbase .NET SDK.
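Here is a hedged sketch of what such a naive pass might look like (the table, columns, key format, and connection string are illustrative assumptions, not from the sample app):

// Sketch: naive 1-row-to-1-document migration using Dapper and the Couchbase .NET SDK.
// Requires "using Dapper;" and "using System.Data.SqlClient;". Names are illustrative.
using (var conn = new SqlConnection(sqlConnectionString))
{
    var rows = conn.Query("SELECT Id, Name, Price FROM Products");
    foreach (var row in rows)
    {
        var id = (int)row.Id;
        // Dapper returns dynamic rows; the SDK serializes the anonymous object to JSON.
        var result = bucket.Insert("Product::" + id, new
        {
            type = "Product",
            name = (string)row.Name,
            price = (decimal)row.Price
        });
        if (!result.Success)
        {
            Console.WriteLine($"Failed to migrate Product {id}: {result.Message}");
        }
    }
}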

Completely flat data is relatively uncommon, however, so for more complex models, you will probably need to write code to migrate from the old relational model to the new document model.

Here are some things you want to keep in mind when writing migration code (of any kind, but especially relational-to-nonrelational):

  • Give yourself plenty of time in planning. While migrating, you may discover that you need to rethink your model. You will need to test and make adjustments, and it‚Äôs better to have extra time than make mistakes while hurrying. Migrating data is an iterative cycle: migrate a table, see if that works, adjust, and keep iterating. You may have to go through this cycle many times.
  • Test your migration using real data. Data can be full of surprises. You may think that NVARCHAR field only ever contains string representations of numbers, but maybe there are some abnormal rows that contain words. Use a copy of the real data to test and verify your migration.
  • Be prepared to run the migration multiple times. Have a plan to clean up a failed migration and start over. This might be a simple DELETE FROM bucket in N1QL, or it could be a more nuanced and targeted series of cleanups. If you plan from the start, this will be easier. Automate your migration, so this is less painful.
  • ETL or ELT? Extract-Transform-Load, or Extract-Load-Transform. When are you going to do a transform? When putting data into Couchbase, the flexibility of JSON allows you to transform-in-place after loading if you choose.
An example ETL migration

I wrote a very simple migration console app using C#, Entity Framework, and the Couchbase .NET SDK. It migrates both the shopping cart and the social media examples from the previous blog post. The full source code is available on GitHub.

This app is going to do the transformation, so this is an ETL approach. This approach uses Entity Framework to map relational tables to C# classes, which are then inserted into documents. The data model for Couchbase can be better represented by C# classes than by relational tables (as demonstrated in the previous blog post), so this approach has lower friction.

I’m going to use C# to write a migration program, but the automation is what’s important, not the specific tool. This is going to be essentially “throwaway” code after the migration is complete. My C# approach doesn’t do any sort of batching, and is probably not well-suited to extremely large amounts of data, so it might be a good idea to use a tool like Talend and/or an ELT approach for very large scale/Enterprise data.

I created a ShoppingCartMigrator class and a SocialMediaMigrator class. I’m only going to cover the shopping cart in this post. I pass it a Couchbase bucket and the Entity Framework context that I used in the last blog post. (You could instead pass an NHibernate session or a plain DbConnection here, depending on your preference).

public class ShoppingCartMigrator
{
    readonly IBucket _bucket;
    readonly SqlToCbContext _context;

    public ShoppingCartMigrator(IBucket bucket, SqlToCbContext context)
    {
        _bucket = bucket;
        _context = context;
    }
}

With those objects in place, I created a Go method to perform the migration, and a Cleanup method to delete any documents created in the migration, should I choose to.

For the Go method, I let Entity Framework do the hard work of the joins, and loop through every shopping cart.

public bool Go()
{
    var carts = _context.ShoppingCarts
        .Include(x => x.Items)
        .ToList();
    foreach (var cart in carts)
    {
        var cartDocument = new Document<dynamic>
        {
            Id = cart.Id.ToString(),
            Content = MapCart(cart)
        };
        var result = _bucket.Insert(cartDocument);
        if (!result.Success)
        {
            Console.WriteLine($"There was an error migrating Shopping Cart {cart.Id}");
            return false;
        }
        Console.WriteLine($"Successfully migrated Shopping Cart {cart.Id}");
    }
    return true;
}

I chose to abort the migration if there’s even one error. You may not want to do that. You may want to log to a file instead, and address all the records that cause errors at once.
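
For example, a variation of the Go loop that keeps going and records failures for later review might look like this (a sketch only, reusing the same _bucket, carts, and MapCart from the class above):

// Requires: using System.Collections.Generic;
var failedCartIds = new List<string>();
foreach (var cart in carts)
{
    var cartDocument = new Document<dynamic>
    {
        Id = cart.Id.ToString(),
        Content = MapCart(cart)
    };
    var result = _bucket.Insert(cartDocument);
    if (!result.Success)
    {
        // Keep migrating; collect the failing ids and deal with them after the run.
        failedCartIds.Add(cart.Id.ToString());
    }
}
Console.WriteLine($"Migration finished with {failedCartIds.Count} failure(s).");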

For the cleanup, I elected to delete every document that has a type of “ShoppingCart”.

public void Cleanup()
{
    Console.WriteLine("Delete all shopping carts...");
    var result = _bucket.Query<dynamic>("DELETE FROM `sqltocb` WHERE type='ShoppingCart';");
    if (!result.Success)
    {
        Console.WriteLine($"{result.Exception?.Message}");
        Console.WriteLine($"{result.Message}");
    }
}

This is the simplest approach. A more complex approach could involve putting a temporary “fingerprint” marker field onto certain documents, and then deleting documents with a certain fingerprint in the cleanup. (E.g. DELETE FROM sqltocb WHERE fingerprint = '999cfbc3-186e-4219-ab5d-18ad130a9dc6'). Or vice versa: fingerprint the problematic data for later analysis and delete the rest. Just make sure to clean up these temporary fields when the migration has completed successfully.

When you try this out yourself, you may want to run the console application twice, just to see the cleanup in action. The second attempt will result in errors because it will be attempting to create documents with duplicate keys.

What about the other features of SQL Server?

Not everything in SQL Server has a direct counterpart in Couchbase. In some cases, it won’t ever have a counterpart. In some cases, there will be a rough equivalent. Some features will arrive in the future, as Couchbase is under fast-paced, active, open-source development, and new features are being added when appropriate.

Also keep in mind that document databases and NoSQL databases often force business logic out of the database to a larger extent than relational databases. As nice as it would be if Couchbase Server had every feature under the sun, there are always tradeoffs. Some are technical in nature, some are product design decisions. Tradeoffs could be made to add relational-style features, but at some point in that journey, Couchbase stops being a fast, scalable database and starts being “just another” relational database. There is certainly a lot of convergence in both relational and non-relational databases, and a lot of change happening every year.

With that in mind, stay tuned for the final blog post in the series. This will cover the changes to application coding that come with using Couchbase, including:

  • SQL/N1QL
  • Stored Procedures
  • Service tiers
  • Triggers
  • Views
  • Serialization
  • Security
  • Concurrency
  • Autonumber
  • OR/Ms and ODMs
  • Transactions
Summary

This blog post compared and contrasted the data features available in Couchbase Server with SQL Server. If you are currently using SQL Server and are considering adding a document database to your project or starting a new project, I am here to help.

Check out the Couchbase developer portal for more details.

Please contact me at matthew.groves@couchbase.com, ask a question on the Couchbase Forums, or ping me on Twitter @mgroves.

The post Moving from SQL Server to Couchbase Part 2: Data Migration appeared first on The Couchbase Blog.

Categories: Architecture, Database

Start Couchbase Using Docker Compose

NorthScale Blog - Tue, 02/14/2017 - 02:41

Couchbase Forums has a question "Can't use N1QL on docker-compose". This blog will show how to run Couchbase using Docker Compose and run a N1QL query.


What is Docker Compose?

Docker Compose allows you to define your multi-container application with all of its dependencies in a single file, then spin your application up in a single command.

Docker Compose introduced the v3 file format in Docker 1.13. How do you know what version of Docker you are running?

The docker version command gives you that information:

Client:
 Version:      1.13.1
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 08:47:51 2017
 OS/Arch:      darwin/amd64

Server:
 Version:      1.13.1
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 08:47:51 2017
 OS/Arch:      linux/amd64
 Experimental: true

Couchbase Docker Compose File

If you are running this version of Docker (or later), you can use the following Compose file:

version: "3"
services:
  db:
    image: arungupta/couchbase
    deploy:
      replicas: 1
    ports:
      - 8091:8091
      - 8092:8092
      - 8093:8093
      - 8094:8094
      - 11210:11210

In this Compose file:

  • A single service named db is defined, using the arungupta/couchbase image.
  • The deploy section asks for one replica; this key only applies when deploying to a swarm (Compose prints a warning and ignores it otherwise, as shown below).
  • The standard Couchbase ports (8091-8094 and 11210) are published to the host.

Couchbase can be started in a couple of ways using this Compose file.

Couchbase using Docker Compose on Single Docker Host

If you want to start Couchbase on a single host (such as provisioned by Docker for Mac or a single Docker Machine), then use the command:

docker-compose up -d

This will show the warning message but starts Couchbase server:

WARNING: Some services (db) use the 'deploy' key, which will be ignored. Compose does not support deploy configuration - use `docker stack deploy` to deploy to a swarm.
Creating couchbase_db_1

Check the status of started service using the command docker-compose ps:

Name                       Command                       State                        Ports            
-----------------------------------------------------------------------------------------------------------------
couchbase_db_1               /entrypoint.sh /opt/couchb   Up                           11207/tcp,                 
                             ...                                                       0.0.0.0:11210->11210/tcp,  
                                                                                       11211/tcp, 18091/tcp,      
                                                                                       18092/tcp, 18093/tcp,      
                                                                                       0.0.0.0:8091->8091/tcp,    
                                                                                       0.0.0.0:8092->8092/tcp,    
                                                                                       0.0.0.0:8093->8093/tcp,    
                                                                                       0.0.0.0:8094->8094/tcp

All the exposed ports are shown and Couchbase is accessible at http://localhost:8091. Use the credentials Administrator/password to access the web console.

Now you can create buckets and connect from CBQ and run N1QL queries. For example:

/Users/arungupta/tools/couchbase/Couchbase\ Server\ 4.5\ EE.app/Contents/Resources/couchbase-core/bin/cbq -u Administrator -p password --engine http://localhost:8093
 Connected to : http://localhost:8093/. Type Ctrl-D or \QUIT to exit.

 Path to history file for the shell : /Users/arungupta/.cbq_history 
cbq> select now_str();
{
    "requestID": "d28280ab-49a4-4254-9f00-06bd1d2b4695",
    "signature": {
        "$1": "string"
    },
    "results": [
        {
            "$1": "2017-02-13T21:36:57.248Z"
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "2.916653ms",
        "executionTime": "2.829056ms",
        "resultCount": 1,
        "resultSize": 56
    }
}
cbq> select version();
{
    "requestID": "51091fa6-dcc5-40f6-9c2b-1eb6732630bb",
    "signature": {
        "$1": "string"
    },
    "results": [
        {
            "$1": "1.6.0"
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "4.599365ms",
        "executionTime": "4.525552ms",
        "resultCount": 1,
        "resultSize": 37
    }
}

Typically, you may be able to scale the services started by Docker Compose using the docker-compose scale command. That will not work here, though, because the ports are published to fixed host ports; scaling the service would cause a port conflict. A possible workaround is sketched below.
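
If you did want to experiment with scaling on a single host, one workaround (a sketch, not part of the original file) is to publish only the container port and let Docker pick a free host port for each replica. Note that scaling containers this way does not, by itself, join them into a Couchbase cluster:

version: "3"
services:
  db:
    image: arungupta/couchbase
    ports:
      - "8091"   # container port only; Docker assigns a different host port per replica

After docker-compose up -d you could then run docker-compose scale db=2, and docker-compose port --index=2 db 8091 should reveal the host port assigned to the second replica.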

The container can be brought down using the command docker-compose down.

Couchbase using Docker Compose on Multi-host Swarm-mode Cluster

Docker allows multiple hosts to be configured in a cluster using Swarm-mode. This can be configured using the command docker swarm init.

Once the cluster is initialized, then the Compose file can be used to start the cluster:

docker deploy --compose-file=docker-compose.yml couchbase

It shows the output:

Creating network couchbase_default
Creating service couchbase_db

This creates a Docker service and the status can be seen using the command docker service ls:

ID            NAME          MODE        REPLICAS  IMAGE
0zls1k4mgrry  couchbase_db  replicated  1/1       arungupta/couchbase:latest

Check the tasks/containers running inside the service using the command docker service ps couchbase_db:

ID            NAME            IMAGE                       NODE  DESIRED STATE  CURRENT STATE        ERROR  PORTS
vf5zicu4mhei  couchbase_db.1  arungupta/couchbase:latest  moby  Running        Running 3 hours ago

Here again, you can connect to the Couchbase server and run N1QL queries:

/Users/arungupta/tools/couchbase/Couchbase\ Server\ 4.5\ EE.app/Contents/Resources/couchbase-core/bin/cbq -u Administrator -p password --engine http://localhost:8093
 Connected to : http://localhost:8093/. Type Ctrl-D or \QUIT to exit.

 Path to history file for the shell : /Users/arungupta/.cbq_history 
cbq> select version();
{
    "requestID": "12c5581e-44ee-4ea7-9017-6a017bb60a58",
    "signature": {
        "$1": "string"
    },
    "results": [
        {
            "$1": "1.6.0"
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "3.725498ms",
        "executionTime": "3.678153ms",
        "resultCount": 1,
        "resultSize": 37
    }
}
cbq> select now_str();
{
    "requestID": "efe034fa-6d00-4327-9fc9-da8f6d15d95c",
    "signature": {
        "$1": "string"
    },
    "results": [
        {
            "$1": "2017-02-13T21:38:33.502Z"
        }
    ],
    "status": "success",
    "metrics": {
        "elapsedTime": "853.491¬Ķs",
        "executionTime": "800.154¬Ķs",
        "resultCount": 1,
        "resultSize": 56
    }
}

The service, and thus the container running in the service, can be terminated using the command docker service rm couchbase_db.

Any more questions? Catch us on Couchbase Forums.

You may also consider running Couchbase Cluster using Docker or read more about Deploying Docker Services to Swarm.

Want to learn more about running Couchbase in containers?

The post Start Couchbase Using Docker Compose appeared first on The Couchbase Blog.

Categories: Architecture, Database

Using Autonumber in Couchbase

NorthScale Blog - Thu, 02/09/2017 - 11:46

Ratnopam Chakrabarti is a software developer currently working for Ericsson Inc. He has been focused on IoT, machine-to-machine technologies, connected cars, and smart city domains for quite a while. He loves learning new technologies and putting them to work. When he’s not working, he enjoys spending time with his 3-year-old son.

Inserting documents with sequential keys (autonumber)

During the software development process we often come across a situation where we need to generate a unique key (of an entity) in an orderly sequential fashion (either increasing or decreasing order). Common examples include:

  1. Storing entries of a log file with an auto-generated sequence number assigned to each row of data

  2. Storing business entities in a database and having a primary key generated from an incremental sequence number

In the relational database world, this is achieved by making use of something known as a "database sequence." The sequence is a feature provided by most database products which simply creates unique number values in an orderly manner. It just increments a value and returns it. Without a database sequence it is not very easy to generate unique numbers in a particular order. That's why it is a popular choice when it comes to populating a primary key (or any unique key for that matter) with unique auto-increasing values.

Other ways to generate randomly unique keys include using features such as GUID or UUID. However, there is no guarantee regarding the auto-increment nature that you get when using a database sequence generator.
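
For instance, a random key generated in Java with the standard java.util.UUID class is guaranteed to be unique, but two consecutive keys carry no ordering information:

// Unique, but not sequential: there is no way to tell which key was generated first.
String key1 = "Prod::" + java.util.UUID.randomUUID().toString();
String key2 = "Prod::" + java.util.UUID.randomUUID().toString();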

No sequence in NoSQL databases

Unlike the relational database world, there is no straightforward built-in sequence generation feature in most of the NoSQL databases on the market. One could argue that it's not usual for a distributed system with free-form data to use an auto-incrementing number as the unique key of a document, because of the conflicting numbers that can be generated when data is replicated across different nodes and shards. Instead, a UUID is a much more viable option to guarantee uniqueness. However, if you need unique IDs that follow a sequence, then you somehow need an auto-incrementing sequence in your NoSQL database, because a UUID does not preserve the ordering of the generated values. The main question: how would you handle that using Couchbase? Well, Couchbase has you covered, and that's what this post will describe.

Set up a Couchbase bucket to save the data

Let's say we are storing items from a product catalog in Couchbase, and while storing the products in a Couchbase bucket we need to set a generated sequence number in each product JSON document so that it can be used as the uniquely identifiable "key" of the document.

To do so, follow the steps below to create a bucket named "prodcat."

  • First, log into the administrative console of Couchbase.
  • Type http://localhost:8091/ui/index.html in your browser.
  • Login with admin username and password.
  • Navigate to the Data Buckets tab and click on Create New Data Bucket.

  • Enter "prodcat" in the Bucket Name field and 512 in the Per Node RAM Quota field.

  • Leave all the other fields as default and click on "Create."
  • Once the bucket is created successfully it will be listed with 0 items.

Using counter documents

Couchbase handles sequence generation with what is known as a "counter" document: a document holding a number that can be incremented or decremented sequentially. An important thing to note here is that the increment or decrement operation on the counter is atomic. When we insert a business entity (such as Product in our case) as a JSON document, we can use the counter document together with a key pattern to generate a sequence.

The following code snippet initializes a counter document with an initial value of 20.

// create a connection to couchbase cluster and bucket
Cluster cluster = CouchbaseCluster.create("127.0.0.1");
Bucket bucket = cluster.openBucket(BUCKET_NAME);

// Here, the BUCKET_NAME = "prodcat" (the one that was created using the Couchbase admin console)

String key = "idGeneratorForProducts";
try {
    // Remove any existing counter document so the example starts from a clean state.
    bucket.remove(key);
} catch (DocumentDoesNotExistException e) {
    // Nothing to remove; the counter document did not exist yet.
}

try {
    // Increment by 0 with an initial value of 20: this creates the counter at 20 if it is absent.
    bucket.counter(key, 0, 20);
} catch (DocumentDoesNotExistException e) {
    log.info("counter doesn't exist yet and no initial value was provided");
}

At this point, our counter document is initialized to a value of 20.

We run the following code in a loop to insert product data in a sequential fashion:

long nextIdNumber = bucket.counter(key, 1).content();
log.info("nextIdNumber = " + nextIdNumber);
String id = "Prod::" + nextIdNumber;

// You're now ready to save your document:
Product product = ProductUtil.getProduct(nextIdNumber);
JsonObject content = JsonObject.create()
        .put("type", Product.TYPE)
        .put("id", product.getId())
        .put("description", product.getDescription())
        .put("price", product.getPrice());

bucket.insert(JsonDocument.create(id, content));

The following is an explanation of the above code:

The nextIdNumber is calculated by incrementing the counter by 1.

We make use of the nextIdNumber to populate the "id" field of the product document.

After completing the operation for three product documents, the prodcat bucket looks like this:

Here "idGeneratorForProducts" is the counter document that holds the current value of the counter. Each product document has its "id" populated from the sequence:

{
  "id": "Product 21",
  "price": "20",
  "type": "Product",
  "description": "This is a utility product"
}

It's worth mentioning that we can implement the sequence in decreasing order as well (a short sketch follows this list). In that case, all we need to do is:

  • Initialize the counter to the maximum value.
  • Decrement the counter by 1 to generate the nextId in sequence.
  • Use the nextId to insert a document in a bucket.
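
Here is a minimal sketch of that variation, reusing the same bucket and counter API as the code above (the initial maximum of 1000000 is just an arbitrary example):

// Initialize the counter once at the chosen maximum (delta of 0, initial value 1000000).
bucket.counter("idGeneratorForProducts", 0, 1000000);

// For each new document, decrement by 1 to get the next id in decreasing order.
long nextIdNumber = bucket.counter("idGeneratorForProducts", -1).content();
String id = "Prod::" + nextIdNumber;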

Conclusion

The code used for this article is written in Java and makes use of the Spring Boot and Spring Data Couchbase dependencies. The same concepts can be applied to any Couchbase client SDK.

Source code for the application can be found at https://github.com/ratchakr/prodcat.

The post Using Autonumber in Couchbase appeared first on The Couchbase Blog.

Categories: Architecture, Database

Getting Started with Oracle Container Cloud Service

NorthScale Blog - Thu, 02/09/2017 - 03:10

Oracle Container Cloud Service is Oracle's entry into the world of managed container services. There are plenty of existing options.

This blog will explain how to get started with Oracle Container Cloud Service. A comparison of different managed services has been started at Managed Container Service.

Before we jump into all the details, let's try to clarify a couple of things about this offering from Oracle.

First, a bit about the name. "Oracle Cloud Container Service" seems more natural and intuitive, since it's a Container Service in the Oracle Cloud. So why is it called "Oracle Container Cloud Service"? Is it because "Oracle Container" is Oracle's container orchestration framework and it's offered as a Cloud Service? Could that mean other orchestration frameworks could be offered as a service as well?

Second, don't confuse it with Oracle Application Container Cloud Service, which allows you to build cloud-native 12-factor applications using a polyglot platform. Now, that confuses me further. Can the Container Service not be used to build 12-factor apps? Are cloud-native and containers mutually exclusive?

Anyway, this is causing more confusion than clarification.

Categories: Architecture, Database

Couchbase Developer Release – Put some cURL in your N1QL

NorthScale Blog - Wed, 02/08/2017 - 13:50
What is cURL?

Ever heard of cURL? It's a famous command-line tool for sending and receiving data using URL syntax, as Wikipedia says. Let's start with an example related to Couchbase. The N1QL query service is available through a REST API. If you want to execute a N1QL query with cURL, and supposing your query service is enabled on localhost, it will look like this:

curl http://localhost:8093/query/service -d 'statement=SELECT * FROM default LIMIT 1'

or if you want to run a fulltext query you would do something like

curl -XPOST -H "Content-Type: application/json" http://localhost:8094/api/index/beer/query -d '{"explain": true,"fields": ["*"],"highlight": {},"query": {"query": "pale ale"}}'

Right now there is nothing specific about N1QL here; it's just an example of a REST API call made with cURL. So why this title? Well, that's because you can call cURL from a N1QL query.

cURL() in N1QL

curl() is a new N1QL function that allows you to access external JSON data over a remote (usually HTTP(S)) endpoint. Like other N1QL functions, it can be used in various N1QL expressions and in the various clauses of SELECT/DML queries (projection, WHERE, FROM, etc.). When used in the FROM clause, the curl() invocation should result in a set/collection of JSON documents.

It has already been covered quite extensively on DZone; I strongly invite you to read that article, as well as the many other DZone posts about N1QL written by the N1QL team.

We can start, as anyone would when trying a new language, with a very simple SELECT:

SELECT CURL("GET", "https://maps.googleapis.com/maps/api/geocode/json", {"data":"address=santa+cruz&components=country:ES&key=YOUR_API_KEY"});

This will get you the following JSON result:

[
  {
    "results": [
      {
        "address_components": [
          {
            "long_name": "Santa Cruz de Tenerife",
            "short_name": "Santa Cruz de Tenerife",
            "types": [
              "locality",
              "political"
            ]
          },
          {
            "long_name": "Santa Cruz de Tenerife",
            "short_name": "TF",
            "types": [
              "administrative_area_level_2",
              "political"
            ]
          },
          {
            "long_name": "Canary Islands",
            "short_name": "CN",
            "types": [
              "administrative_area_level_1",
              "political"
            ]
          },
          {
            "long_name": "Spain",
            "short_name": "ES",
            "types": [
              "country",
              "political"
            ]
          }
        ],
        "formatted_address": "Santa Cruz de Tenerife, Spain",
        "geometry": {
          "bounds": {
            "northeast": {
              "lat": 28.487616,
              "lng": -16.2356646
            },
            "southwest": {
              "lat": 28.4280248,
              "lng": -16.3370045
            }
          },
          "location": {
            "lat": 28.4636296,
            "lng": -16.2518467
          },
          "location_type": "APPROXIMATE",
          "viewport": {
            "northeast": {
              "lat": 28.487616,
              "lng": -16.2356646
            },
            "southwest": {
              "lat": 28.4280248,
              "lng": -16.3370045
            }
          }
        },
        "place_id": "ChIJcUElzOzMQQwRLuV30nMUEUM",
        "types": [
          "locality",
          "political"
        ]
      }
    ]
  }
]

There is a lot of information here and maybe you are only interested in getting the coordinates of the identified address. It’s easy to do. You just treat the result of the cURL function like any other JSON object:

SELECT CURL("GET", "https://maps.googleapis.com/maps/api/geocode/json", {"data":"address=santa+cruz&components=country:ES&key=YOUR_API_KEY"}).results[0].geometry;

And this will return just what you wanted:

[
  {
    "geometry": {
      "bounds": {
        "northeast": {
          "lat": 28.487616,
          "lng": -16.2356646
        },
        "southwest": {
          "lat": 28.4280248,
          "lng": -16.3370045
        }
      },
      "location": {
        "lat": 28.4636296,
        "lng": -16.2518467
      },
      "location_type": "APPROXIMATE",
      "viewport": {
        "northeast": {
          "lat": 28.487616,
          "lng": -16.2356646
        },
        "southwest": {
          "lat": 28.4280248,
          "lng": -16.3370045
        }
      }
    }
  }
]

While this is quite nice, there is currently no link with your data. Let’s pretend you have a document with an address but no geo coordinates. You can now add those coordinates with the following N1QL query:

UPDATE `travel-sample` USE KEYS "myDocumentWithoutCoordinates" SET geo = CURL("GET", "https://maps.googleapis.com/maps/api/geocode/json", {"data":"address=santa+cruz&components=country:ES&key=YOUR_API_KEY"}).results[0].geometry RETURNING *;

And now you have just modified an existing document, based on an external service, with a N1QL query.

Categories: Architecture, Database