Wednesday, July 26, 2017

NoSQL in Azure: Cosmos DB

NoSQL technologies have been around for a while now; in the past I wrote about both MongoDB and Graph Databases.

Microsoft recently introduced Cosmos DB within its Azure Cloud. Cosmos DB is not a single database, but rather a set of common data services for NoSQL DBs in the cloud (scalability, distribution, partitioning, etc.). Microsoft describes it as a “globally distributed database service designed to enable you to elastically and independently scale throughput and storage across any number of geographical regions with a comprehensive SLA. You can develop document, key/value, or graph databases with Cosmos DB using a series of popular APIs and programming models”.

Azure Cosmos DB currently supports the following NoSQL DBs:
  • DocumentDB
  • MongoDB
  • Table API
  • Graph API

Pricing in Cosmos DB is based on the Request Unit, which is defined as:

“A Request Unit (RU) is the measure of throughput in Azure Cosmos DB. 1 RU corresponds to the throughput of the GET of a 1KB item”.

There is a RU Calculator to estimate the cost of your Cosmos DB.
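As a rough illustration of how the RU definition turns into a throughput figure, here is a minimal Python sketch. Only the "1 RU per GET of a 1KB item" figure comes from the definition above; the function name and the `other_ops` parameter are hypothetical, and the RU cost of any non-read operation is something you would have to supply yourself (for example, from the RU Calculator).

```python
# From the definition quoted above: one GET of a 1 KB item costs 1 RU.
READ_RU_PER_1KB_GET = 1.0


def estimate_rus_per_second(reads_per_second, other_ops=()):
    """Back-of-the-envelope RU/s estimate.

    reads_per_second: number of 1 KB point reads per second.
    other_ops: iterable of (ops_per_second, ru_cost_per_op) pairs; the RU
               costs here are caller-supplied assumptions, not official figures.
    """
    total = reads_per_second * READ_RU_PER_1KB_GET
    for ops_per_second, ru_cost_per_op in other_ops:
        total += ops_per_second * ru_cost_per_op
    return total


if __name__ == "__main__":
    # 500 reads/s of 1 KB items, nothing else
    print(estimate_rus_per_second(500))
```

This is only arithmetic on the definition; real workloads depend on item size, indexing, and query shape, so treat the result as a starting point.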

More information on Cosmos DB can be found here.

A nice detail is that all experimenting with Cosmos DB can be done locally with the Azure Cosmos DB Emulator, without having to spend any money on Azure (at least during the initial development).

MongoDB

MongoDB is probably the most mature NoSQL DB on the market. It has been in use for years now, and offers flexible data storage and great performance (especially on Big Data).

MongoDB is a Document database which stores data in flexible, JSON-like (BSON) documents, meaning fields can vary from document to document and the data structure can be changed over time.

MongoDB is now part of the Cosmos DB offer, which makes it easier to integrate it in a Microsoft environment, especially on Azure.

Here is an example of a MongoDB document (the equivalent of a row in an RDBMS):
{
  "id": "WakefieldFamily",
  "parents": [
      { "familyName": "Wakefield", "givenName": "Robin" },
      { "familyName": "Miller", "givenName": "Ben" }
  ],
  "children": [
      {
        "familyName": "Merriam",
        "givenName": "Jesse",
        "gender": "female",
        "grade": 1,
        "pets": [
            { "givenName": "Goofy" },
            { "givenName": "Shadow" }
        ]
      },
      {
        "familyName": "Miller",
        "givenName": "Lisa",
        "gender": "female",
        "grade": 8
      }
  ],
  "address": { "state": "NY", "county": "Manhattan", "city": "NY" },
  "creationDate": 1431620462,
  "isRegistered": false
}
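
To make the "fields can vary from document to document" point concrete, here is a small Python sketch using only the standard library. The first document is a trimmed copy of the example above; the second document (`SmithFamily`, with a `phone` field) and the helper `pet_names` are hypothetical, added purely for illustration.

```python
import json

# Trimmed copy of the family document shown above.
family_a = json.loads("""
{
  "id": "WakefieldFamily",
  "children": [
    {"givenName": "Jesse", "grade": 1,
     "pets": [{"givenName": "Goofy"}, {"givenName": "Shadow"}]},
    {"givenName": "Lisa", "grade": 8}
  ],
  "isRegistered": false
}
""")

# Hypothetical second document: different fields, same "collection".
family_b = {"id": "SmithFamily", "children": [], "phone": "555-0100"}


def pet_names(family):
    """Collect all pet names, tolerating children that have no 'pets' field."""
    return [
        pet["givenName"]
        for child in family.get("children", [])
        for pet in child.get("pets", [])
    ]


if __name__ == "__main__":
    for doc in (family_a, family_b):
        print(doc["id"], pet_names(doc))
```

Because no schema is enforced, code reading such documents usually has to treat optional fields defensively, as `pet_names` does with `.get(...)`.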

The main advantage of using MongoDB within Cosmos DB is the better integration in terms of development and deployment. Using MongoDB within Cosmos DB removes the need for a VM to host the MongoDB service, and Microsoft tools such as the Cosmos DB Emulator make it easier to build MongoDB solutions that run both locally and within Azure in a matter of clicks.

A basic tutorial on MongoDB in C# is here.

DocumentDB

DocumentDB is Microsoft's Azure alternative to MongoDB; a lot has been written about how the two compare, so that is out of scope here.

The main advantage of using DocumentDB instead of MongoDB is the better integration with Microsoft tools and the required development libraries, even though now that MongoDB is supported by Cosmos DB this gap keeps getting narrower.

For the RDBMS fans out there, it’s worth mentioning that DocumentDB introduced a feature called DocumentDB API SQL, which lets you query the DocumentDB NoSQL database with a familiar SQL-like syntax (btw, see the contradiction?).
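
To give a feel for what such a query looks like, the sketch below holds a query string in the DocumentDB SQL dialect and then mimics its WHERE clause over an in-memory list in plain Python. The collection name `Families`, the second sample document, and the helper `families_in_state` are assumptions for illustration; nothing here calls the actual DocumentDB SDK.

```python
# Query text in the DocumentDB SQL dialect; "Families" is an assumed collection name.
QUERY = 'SELECT * FROM Families f WHERE f.address.state = "NY"'

# In-memory stand-ins for stored documents (the second one is hypothetical).
families = [
    {"id": "WakefieldFamily", "address": {"state": "NY", "city": "NY"}},
    {"id": "ExampleFamily", "address": {"state": "WA", "city": "Seattle"}},
]


def families_in_state(docs, state):
    """Plain-Python equivalent of the WHERE clause in QUERY."""
    return [d for d in docs if d.get("address", {}).get("state") == state]


if __name__ == "__main__":
    print(QUERY)
    print([d["id"] for d in families_in_state(families, "NY")])
```

Note how the query reaches into the nested `address` object with dot notation, something a flat relational row would not need.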

Documents (Rows of data) in DocumentDB are like those in MongoDB, except the Microsoft product works with plain JSON instead of BSON.

A basic tutorial on DocumentDB in C# is here.

Graph API

A graph is a structure that's composed of vertices and edges. Both vertices and edges can have an arbitrary number of properties. Vertices denote discrete objects such as a person, a place, or an event. Edges denote relationships between vertices. For example, a person might know another person, be involved in an event, and have recently been at a location. Properties express information about the vertices and edges.

Graph Databases have been around for some time now (especially since Social Media companies such as Facebook and Twitter became popular). A notable example of a Graph Database is Neo4j.

Azure Cosmos DB offers a Graph API as the Azure Graph DB offer. Graph data in Azure Cosmos DB is queried with Gremlin, the Apache TinkerPop graph traversal language, and can also be used with other TinkerPop-compatible graph systems like Apache Spark GraphX.

Again, the tooling integration for the Microsoft product is much better than that of its graph DB competitors. When it comes to the graphical representation of the graph data, something nicely supported by Neo4j out of the box, Microsoft offers an open source client application called Graph Explorer, which allows easy querying and displaying of the data.

An example of a graphical representation of a Graph dataset.

A Gremlin query to create a Vertex looks like this:
g.addV('person');

A Vertex can have properties such as:
g.addV('person').property('id', 'thomas').property('firstName', 'Thomas').property('age', 44);

You can add an Edge such as “knows”, for each friend of Thomas:
g.V('thomas').addE('knows').to(g.V('ben'));

You can get all Vertices and Edges by running these queries:
g.V(); g.E();

Then you can run a traversal query to show all the Friends of Thomas:
g.V('thomas').outE('knows').inV().hasLabel('person');

You can go as far as retrieving in a simple query all the Friends of Friends of Thomas:
g.V('thomas').outE('knows').inV().hasLabel('person').outE('knows').inV().hasLabel('person');

CATCH: Edges are directional (one-way). For example, “PersonA knows PersonB” does not mean that “PersonB knows PersonA”, unless you add a second Edge to represent this.
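
To see the traversal and the one-way nature of edges outside of Gremlin, here is a minimal in-memory Python sketch. The `DirectedGraph` class and the vertices `mary` and `robin` are hypothetical; `thomas`, `ben`, and the `knows` label come from the queries above. It is a toy model of the idea, not how Cosmos DB stores graphs.

```python
class DirectedGraph:
    """Toy directed graph: edges are stored per (source, label) pair."""

    def __init__(self):
        self._edges = {}

    def add_edge(self, source, label, target):
        # Adds a one-way edge source -[label]-> target.
        self._edges.setdefault((source, label), set()).add(target)

    def out(self, source, label):
        # Vertices reachable from source over outgoing edges with this label.
        return set(self._edges.get((source, label), set()))


def friends_of_friends(graph, person):
    """Two 'knows' hops out from person, like the last Gremlin query above."""
    result = set()
    for friend in graph.out(person, "knows"):
        result |= graph.out(friend, "knows")
    return result


graph = DirectedGraph()
graph.add_edge("thomas", "knows", "ben")
graph.add_edge("thomas", "knows", "mary")   # hypothetical vertex
graph.add_edge("ben", "knows", "robin")     # hypothetical vertex

if __name__ == "__main__":
    print(sorted(graph.out("thomas", "knows")))
    print(sorted(friends_of_friends(graph, "thomas")))
    print("thomas" in graph.out("ben", "knows"))
```

In this sketch `thomas` knows `ben`, but `ben` does not know `thomas` until a second edge is added in the opposite direction.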

For the usual RDBMS fans, you can read this introduction, and keep in mind that there is a nice document on how to “translate” SQL queries into Gremlin queries.

And here is the full Gremlin syntax documentation.

Table API



Azure Cosmos DB provides the Table API for applications that need a key-value store with flexible schema, predictable performance, global distribution, and high throughput. The Table API provides the same functionality as Azure Table storage, but leverages the benefits of the Azure Cosmos DB engine.
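
As a rough mental model of a key-value store with a flexible schema, the Python sketch below keeps entities in a dictionary keyed by a partition key and a row key, the two keys Azure Table storage uses to identify an entity. The `InMemoryTable` class, its method names, and the sample data are hypothetical; this is not the Table API client.

```python
class InMemoryTable:
    """Toy key-value table: (partition_key, row_key) -> free-form properties."""

    def __init__(self):
        self._entities = {}

    def upsert(self, partition_key, row_key, **properties):
        # No schema: each entity may carry a different set of properties.
        self._entities[(partition_key, row_key)] = dict(properties)

    def get(self, partition_key, row_key):
        # Point lookup by full key; returns None when the entity is missing.
        return self._entities.get((partition_key, row_key))

    def query_partition(self, partition_key):
        # All entities that share a partition key.
        return [
            props
            for (pk, _rk), props in self._entities.items()
            if pk == partition_key
        ]


table = InMemoryTable()
table.upsert("NY", "WakefieldFamily", isRegistered=False)
table.upsert("NY", "ExampleFamily", phone="555-0100")  # different properties

if __name__ == "__main__":
    print(table.get("NY", "WakefieldFamily"))
    print(len(table.query_partition("NY")))
```

Lookups by the full key are the cheap, predictable operation in this model, which is what makes key-value stores a good fit for high-throughput workloads.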

You can continue to use Azure Table storage for tables with high storage and lower throughput requirements. Azure Cosmos DB will introduce support for storage-optimized tables in a future update, and existing and new Azure Table storage accounts will be upgraded to Azure Cosmos DB.

More information about Table API can be found here.

Tuesday, July 11, 2017

Azure Apps Provisioning to External Users

So, you’re finally deploying your apps to Azure like there’s no tomorrow, through a solid CI and CD process, and everyone is happy about it.

Then you realize you still have one challenge: you need to provide those apps to your customers in a smooth but secure way, just like you’ve been doing for years with Active Directory Federation, where the customer logs onto his own AD, and from there he can access your apps.

How to achieve that in Azure?

Turns out that (after solving a few puzzles – we know MS documentation, don’t we) it is quite simple!

First you need to invite your customer to join your Azure Active Directory; this is done in the New Portal by opening the Azure Active Directory “menu blade”.




Then click on the menu item Users and groups.







Then click on the menu item All users.





Here you can invite an external user to join your AAD as a Guest; this will give your customer enough permission to use your Azure deployed apps (that you assign them permissions to), without being able to access your other Azure resources.

NOTE: It is also possible from the Classic Portal (only) to add an external user by creating a full user within your Active Directory, but that’s out of scope here.

Now click on the New guest user link.




Here you simply enter your customer’s email address, and optionally a personal message to be included with the invitation.
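
If you would rather script the invitation than click through the portal, the sketch below only builds the JSON body that, to my understanding, the Microsoft Graph invitations endpoint expects. It sends nothing. The helper name, the example email address, and the message text are hypothetical, and you should check the field names against the current Microsoft Graph documentation before relying on them.

```python
import json

# Assumed endpoint for creating guest invitations via Microsoft Graph.
INVITATIONS_URL = "https://graph.microsoft.com/v1.0/invitations"


def build_invitation(email, redirect_url, message=None):
    """Build the request body for a guest invitation (no network call is made)."""
    body = {
        "invitedUserEmailAddress": email,
        "inviteRedirectUrl": redirect_url,
        "sendInvitationMessage": True,
    }
    if message:
        # Optional personal message, like the one the portal form offers.
        body["invitedUserMessageInfo"] = {"customizedMessageBody": message}
    return body


if __name__ == "__main__":
    payload = build_invitation(
        "customer@example.com",              # hypothetical address
        "http://myapps.microsoft.com",
        message="Welcome to our apps!",      # hypothetical message
    )
    print(json.dumps(payload, indent=2))
```

Actually sending this body requires an authenticated POST with sufficient directory permissions, which is outside the scope of this sketch.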

Once this is done, an email message will be sent to your customer’s inbox, looking like this.

Once your customer clicks on this link he’ll get to the URL http://myapps.microsoft.com, where he will see an empty page!

:)


Yes, first you need to assign permissions to some app that you want your customer to be able to use.

This can be done in the Azure Portal, within the Azure Active Directory section, under Enterprise Applications. Select the Application you need to assign, then click on the Users and groups menu item.


From here you can click on the Add user button.


There you will be able to select the user(s) you want to assign to the App, and if any AppRole has been defined in the App Manifest, assign those as well.

Your customer can now access the same page again, and this time he will see a list of allowed Apps.

And that’s it!





