Tuesday, April 17, 2018

Azure @ Enterprise - Connecting to Management API using Azure AD App certificate

Introduction

When we develop large multi-tenant applications, we often face requirements to dynamically provision infrastructure resources, which we never need to do for simple applications. For example, assume the enterprise security practice or a business requirement demands that each tenant's data be isolated into a separate database. Then we have to create databases on the fly when a tenant joins the application. This is drastically different from working on a simple application where a single database stores everything and the most we have to deal with is indexing and partitioning at the database level and load balancing at the front end.

When we create or provision resources from the application, there are many security related questions to answer. What if someone hacks into the application and deletes the databases? How do we handle the noisy neighbor problem? The list is long.

To handle security on-premises, the usual solution is to have separate service accounts that have permission to create databases, and to isolate that provisioning service from the web services exposed to client machines. The provisioning service is exposed only internally, either guarded by authorization or reachable only via a netNamedPipe binding in WCF.

Managing resources in Azure

Cloud computing is expected to solve the infrastructure provisioning problem, and Azure does that well. When enterprise meets Azure, all the security concerns mentioned above get reevaluated. In the cloud this becomes even more important: an attack or poor code may create a large number of highly priced resources, which directly affects the financials, or resources can be deleted, which brings the entire system down. In on-premises systems there is very limited or no way for an attack to delete a virtual machine; in the cloud that is not the case.

How do we secure a component that does infrastructure provisioning? This problem can be solved in Azure in many ways. We can have a service that is secured using Azure AD and exposed only inside the enterprise's own virtual network (vNet) in Azure. But then the question becomes how to secure the Azure AD application itself. Azure AD supports different types of authentication, and enterprises like MFA and certificate based auth. The latest in the series is Managed Service Identity.

MFA (Multi Factor Authentication) helps secure something exposed to users, who can look at the security token and enter it in a web page or device. But for service to service communication, or for a scheduled job or queued operation calling a service, MFA is not suitable. Certificates help there.

Securing in the world of Microservices - Automation RunBook?

In a large enterprise there could be many multi-tenant applications that need infrastructure provisioning, and the enterprise may have only one Azure subscription for all of them. In such a scenario, handing out certificates that carry the privilege to create Azure resources to all those apps is not feasible; those apps should not run with that level of privilege.

One solution in Azure is to use Azure Automation. An Automation runbook can run with high privilege and create Azure resources, and it can be exposed to the applications via webhooks. Applications invoke the webhook with some kind of application identity or developer key in a request header; once the runbook starts it can check that key and perform the action only if the caller is allowed. Please note that webhooks have no security mechanism built in: the URL contains a secret token and whoever knows the URL can invoke it. The runbook has to check the header and validate the caller itself, as in the sketch below.
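
Below is a minimal sketch of how an application could invoke such a webhook. The header name, config keys and payload shape are made up for illustration; they have to match whatever the runbook actually validates. It reuses the Configurations helper shown later in this post and needs System.Net.Http and System.Text.

private static async Task<string> StartProvisioningJob(string tenantName)
{
    using (HttpClient client = new HttpClient())
    {
        // The key in the header lets the runbook decide whether this caller is allowed to provision.
        client.DefaultRequestHeaders.Add("x-application-key", Configurations.GetConfigValueByKey("ApplicationKey"));
        string webhookUrl = Configurations.GetConfigValueByKey("ProvisioningWebhookUrl");
        StringContent payload = new StringContent("{ \"TenantName\": \"" + tenantName + "\" }", Encoding.UTF8, "application/json");
        HttpResponseMessage response = await client.PostAsync(webhookUrl, payload);
        response.EnsureSuccessStatusCode();
        // The webhook response body carries the id(s) of the runbook job(s) it started.
        return await response.Content.ReadAsStringAsync();
    }
}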

Writing runbooks is easy and there are plenty of tutorials available on how to get it right.

But one problem remains. The webhook returns a JobId. How do the applications check the status of the job?

Callback?

If we need to use the Azure Management API we end up with certificates again. But status reporting is easy if the Automation runbook accepts a callback URL and invokes it on job completion.

A webhook accepting another webhook to call on completion may make things complicated, but it is a good solution that avoids polling.
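
If we go the callback route, the application only needs to expose an endpoint the runbook can POST to when it finishes. Below is a minimal sketch assuming ASP.NET Web API with attribute routing enabled; the route, class names and payload shape are made up for illustration and have to match whatever the runbook sends.

public class ProvisioningCallbackController : ApiController
{
    // The runbook POSTs the job id and final status here when it finishes.
    [HttpPost]
    [Route("api/provisioning/callback")]
    public IHttpActionResult JobCompleted([FromBody] JobCompletionNotification notification)
    {
        // Persist the status or trigger the next step of tenant provisioning here.
        Trace.TraceInformation($"Job {notification.JobId} finished with status {notification.Status}");
        return Ok();
    }
}

public class JobCompletionNotification
{
    public string JobId { get; set; }
    public string Status { get; set; }
}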

Unfortunately, if we do end up polling, below are code snippets that can be used to get the Automation job status using the .Net SDK. There are many code snippets available on the internet, but it is very difficult to find working code that uses a certificate to authenticate to the Azure Management API.

Since the authentication APIs accept plain strings and the parameter names are confusing, it gets complicated easily.

Code snippets

Below is the entry point, which accepts the inputs necessary to locate an Azure Automation job.

private static async Task<JobGetResponse> GetJobResponse(string subscriptionGuid,
    string resourceGroupName, string automationAccount, string jobId)
{
    AutomationManagementClient client = await AutomationManagementClientFactory.Get(subscriptionGuid);
    return client.Jobs.Get(resourceGroupName, automationAccount, Guid.Parse(jobId));
}

The returned JobGetResponse has a Job property which exposes most of the properties of the job, including its status.
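
As a usage sketch, the entry point can be called in a simple polling loop until the job reaches a terminal state. The status values and the Job.Properties.Status property path below are assumptions based on the Automation SDK's job model, so verify them against the SDK version in use.

private static async Task WaitForJobCompletion(string subscriptionGuid, string resourceGroupName,
    string automationAccount, string jobId)
{
    // Terminal job statuses; adjust the list to the statuses your runbooks actually report.
    string[] terminalStates = { "Completed", "Failed", "Suspended", "Stopped" };
    while (true)
    {
        JobGetResponse response = await GetJobResponse(subscriptionGuid, resourceGroupName, automationAccount, jobId);
        string status = response.Job.Properties.Status;
        if (terminalStates.Contains(status)) // Requires System.Linq.
        {
            Console.WriteLine($"Job {jobId} finished with status {status}");
            return;
        }
        await Task.Delay(TimeSpan.FromSeconds(30)); // Poll interval; tune to the expected job duration.
    }
}
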
In order to get the code working we need a valid AutomationManagementClient, and how to feed the right string values into the flow is the trickiest part.

internal class AutomationManagementClientFactory
{
    internal static async Task<AutomationManagementClient> Get(string subscriptionGuid)
    {
        // The resource URI below is the management audience for public Azure; it differs in sovereign clouds.
        string token = await TokenFactory.GetAccessToken("https://management.core.windows.net/");
        TokenCloudCredentials tcc = new TokenCloudCredentials(subscriptionGuid, token);
        return new AutomationManagementClient(tcc);
    }
}

This depends on the TokenFactory. But before going there, note the catch here: the hard coded URL is the resource URI of the Azure management endpoint. Let's see the TokenFactory class.

internal class TokenFactory
{
    /// <summary>
    /// Gets an access token for the given resource using the Azure AD app's certificate (ADAL).
    /// </summary>
    internal static async Task<string> GetAccessToken(string resource)
    {
        var context = new AuthenticationContext($"https://login.windows.net/{Configurations.TenantId}", TokenCache.DefaultShared);
        var assertionCert = GetClientAssertionCertificate(Configurations.AzureADApplicationId);
        var result = await context.AcquireTokenAsync(resource, assertionCert);
        return result.AccessToken;
    }

    internal static IClientAssertionCertificate GetClientAssertionCertificate(string clientId)
    {
        string certIssuerName = Configurations.GetConfigValueByKey("CertificateIssuerName");
        X509Certificate2 clientAssertionCertPfx = CertificateHelper.FindCertificateByIssuerName(certIssuerName);
        return new ClientAssertionCertificate(clientId, clientAssertionCertPfx);
    }
}

The responsibility of this class is to get an authentication token for a resource, and the resource here is the Azure Management endpoint. The authentication context uses the Azure AD tenant GUID to get the token; the TenantId is not the Azure AD application id.

It uses a certificate which is found by its issuer name. The criterion used to locate the certificate can be anything, but the rule is that it must be the same certificate that is registered on the Azure AD application. The application's id is obtained from config, and it has to be the Application Id, not the object id of the Azure AD app.

The signature may confuse us: the client assertion certificate uses the application id, but the parameter is named client id to keep it generic.

The last piece is the certificate helper. As mentioned above, how we find the cert is not relevant as long as it is the right certificate. Adding the code for that as well.

public static class CertificateHelper
{
    /// <summary>
    /// Finds a certificate by issuer name in the current user's personal store.
    /// </summary>
    public static X509Certificate2 FindCertificateByIssuerName(string findValue)
    {
        using (X509Store store = new X509Store(StoreName.My, StoreLocation.CurrentUser))
        {
            store.Open(OpenFlags.ReadOnly);
            X509Certificate2Collection col = store.Certificates.Find(X509FindType.FindByIssuerName,
                findValue, false); // Don't validate certs, since the test root isn't installed.

            return col.Count == 0 ? throw new CryptographicException("Certificate not found") : col[0];
        }
    }
}


Prerequisites / Environment setup

  • An Azure AD application which has permission on the resource group where the Automation account resides.
  • The above Azure AD application configured to accept the certificate for issuing tokens. Install the certificate into the proper store; in this case the code searches the Current User's personal store. If this code runs from an IIS web application under a service account, the store can be different.

Why is the code snippet important?

Once we get the snippet working and look at the code, it feels simple. But when we hit a situation where it doesn't work, nothing makes sense: what is the client id, what is the resource id, etc...

The hard coded strings are applicable to the public Azure cloud. When the code runs against Azure Government or other clouds, the values will differ.

Exceptions

Below are some exceptions which may occur during development.

Access token from wrong audience

The below exception may occur if the token obtained from TokenFactory is not associated with the right resource.

"The access token has been obtained from wrong audience or resource ’https://management.core.windows.net'. It should exactly match (including forward slash) with one of the allowed audiences ‘https://management.core.windows.net/’,’https://management.azure.com/’"

Enjoy...

Tuesday, April 10, 2018

Azure @ Enterprise - Tuning the HDI clusters programmatically

HDInsight Cluster

HDInsight, shortly referred to as HDI, is the Microsoft wrapper around Hadoop and other open source data analytics technologies such as Spark. It is based on the Hortonworks platform. It can be installed on-premises and is available in Azure as well, in the form of a platform service.

In Azure, the advantage is that scaling can be done easily, though it takes around 15 minutes. We can create a cluster for a specific workload and delete it after the work is done. This helps save a lot of money, since the cluster is costly while it is running.

HDInsight @ Enterprise

In an enterprise the workloads differ, and different application teams may want to use HDI clusters for their various workloads. Either every application writes its own code to create HDI clusters using the Azure management APIs, or there is a common service which serves all the applications. When we have a common service and the different application workloads have different cluster demands, we need to adjust the cluster properties per workload.

Setting the cluster properties is really complex since the properties are spread across different levels. There are properties at the cluster level such as the number of worker nodes, at the node manager level, at the Livy job submission level, worker JVM properties, etc... Getting these properties under control is a big challenge.

Sometimes we may want to reuse clusters before deleting them, to save the time of cluster creation. At the time of writing this post it takes around 15-20 minutes to get a new cluster created, so if the common service can hand existing clusters to subsequent consumers it saves a good amount of time.

Manually

Manually, we can easily adjust the properties from the Azure portal and from the Ambari views of the specific cluster. Some links are given below.

https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-manage-ambari
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-changing-configs-via-ambari
https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-resource-manager
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-administer-use-portal-linux

After setting some properties the cluster services need a restart. The portal shows whether a restart is required based on which property was changed.

API

It is easy to adjust properties at the cluster level using the Azure APIs. But when it comes to properties inside the cluster, such as the Node Manager heap size, we have to rely on the Ambari REST API. Below are some links on how to do the same.

https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-manage-ambari-rest-api#example-update-configuration
https://community.hortonworks.com/content/supportkb/49134/how-to-stop-start-a-ambari-service-component-using.html

Using these APIs is among the tougher things in the API world. We have to read the current configuration, apply our changes to it and send it back with a new version tag; something similar to what we do in source control: get latest, make the change and commit the change set.
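
To make the get latest / change / commit flow concrete, below is a rough sketch (not production code) of how that round trip could look against the Ambari REST API from C# using HttpClient. The cluster name, credentials, config type and property JSON are placeholders, the JSON handling is deliberately simplified, and the exact payload shape should be taken from the documentation links above.

internal static async Task UpdateAmbariConfigAsync(string clusterName, string user, string password,
    string configType, string newTag, string propertiesJson)
{
    using (HttpClient client = new HttpClient())
    {
        // Ambari on HDInsight sits behind https://<clustername>.azurehdinsight.net with basic auth.
        client.BaseAddress = new Uri($"https://{clusterName}.azurehdinsight.net/");
        string credentials = Convert.ToBase64String(Encoding.ASCII.GetBytes($"{user}:{password}"));
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", credentials);
        // Ambari rejects modifying requests that do not carry this header.
        client.DefaultRequestHeaders.Add("X-Requested-By", "ambari");

        // 1. Get latest: read the currently desired config tags to know what we are changing.
        string currentTags = await client.GetStringAsync($"api/v1/clusters/{clusterName}?fields=Clusters/desired_configs");
        Console.WriteLine(currentTags);

        // 2. Commit the change set: send the full property set back under a new tag.
        string body = "{ \"Clusters\": { \"desired_config\": { \"type\": \"" + configType +
                      "\", \"tag\": \"" + newTag + "\", \"properties\": " + propertiesJson + " } } }";
        HttpResponseMessage response = await client.PutAsync($"api/v1/clusters/{clusterName}",
            new StringContent(body, Encoding.UTF8, "application/json"));
        response.EnsureSuccessStatusCode();
    }
}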

If the jobs are submitted using Livy, there is an option to send some parameters that apply at the job level. Examples of such parameters are the executor cores and executor memory.

https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-troubleshoot-spark#how-do-i-configure-a-spark-application-by-using-livy-on-clusters
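
As an illustration, a Livy batch submission carrying such job level settings could look roughly like the sketch below. The jar path, class name and values are placeholders, and the endpoint is assumed to be the cluster's /livy/batches URL behind the same basic auth used for Ambari.

internal static async Task<string> SubmitSparkBatchAsync(string clusterName, string user, string password)
{
    using (HttpClient client = new HttpClient())
    {
        client.BaseAddress = new Uri($"https://{clusterName}.azurehdinsight.net/");
        string credentials = Convert.ToBase64String(Encoding.ASCII.GetBytes($"{user}:{password}"));
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", credentials);

        // Job level tuning knobs travel with the submission instead of the cluster configuration.
        string body = @"{
            ""file"": ""wasbs:///example/jars/spark-job.jar"",
            ""className"": ""com.contoso.SampleJob"",
            ""executorCores"": 2,
            ""executorMemory"": ""4g"",
            ""numExecutors"": 10
        }";
        HttpResponseMessage response = await client.PostAsync("livy/batches",
            new StringContent(body, Encoding.UTF8, "application/json"));
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync(); // Contains the Livy batch id for tracking the job.
    }
}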

Handle restarts

As mentioned earlier, some properties require the affected services to restart, and the UI shows a warning when a restart is needed. What do we do when we use the API? The pragmatic answer is to restart the service after setting the properties, regardless of whether a restart is needed or not :)

https://community.hortonworks.com/questions/50823/ambari-rest-api-to-restart-all-services.html
https://community.hortonworks.com/questions/123749/api-restart-the-services-that-restart-is-required.html
https://stackoverflow.com/questions/45739985/api-restart-the-services-that-restart-is-required

The usage of the restart API is pretty much straightforward, so no snippet for that here. But if anyone is facing issues with these APIs, please comment on this post.

Tuesday, April 3, 2018

IoT - NodeMCU

Introduction

Nobody nowadays needs an introduction to IoT. To me it is more of a buzzword for all the devices or things getting connected to the internet. Regardless of whether the term 'IoT' exists or not, devices will get connected to the internet as long as they can talk over WiFi, phone networks or any other connection mechanism.

Server side

IoT and cloud seem to be getting coupled together via specialized IoT offerings from cloud providers. As long as we can host an HTTP endpoint on the internet that can handle the load, and the device has connectivity to it, we are good for most use cases; we don't really need to follow the cloud providers' buzzwords. Yet so many marketing sessions have happened, and are still happening, on IoT that just demo the server side and say nothing at all about the device side.

Device / Thing

A device is any Turing machine with internet connectivity and the required sensors and actuators. The decision making is done by the Turing machine, which is nothing but a computer CPU. Generally people use small boards which have these capabilities; Arduino and Raspberry Pi are famous in this area. If we use an old PC, laptop or mobile connected to the required sensors and actuators, it is just as much a Thing. It doesn't always have to be a small device.

But a small device has an advantage over PCs, laptops and mobiles: the display and keyboard are optional for the so-called Thing. The boards are basically modular; if we need a moisture sensor we plug it in, otherwise we just don't use one. That reduces the cost drastically. The board ensures there is computing power from the processor and I/O ports to communicate with the world.

NodeMCU

This is one of the boards, similar to Arduino and Raspberry Pi: more powerful than an Arduino but below a Raspberry Pi. The boards are available from 2.5 USD and up. NodeMCU is currently in version 3, which indicates there were two versions earlier; people have used those, reported issues and presumably got them fixed, so it is a good time to buy and try one now.

Why NodeMCU

The simple answer is that it is a real out-of-the-box WiFi enabled Thing; other boards may need additional purchases to connect to the internet. Second, it has better specs than an Arduino in the same price range.

This post is mainly about my experience with one variant of the NodeMCU V3 board named LoLin. Below is the AliExpress link.
https://www.aliexpress.com/item/new-Wireless-module-CH340-NodeMcu-V3-Lua-WIFI-Internet-of-Things-development-board-based-ESP8266/32556303666.html

Driver

We have to install the proper USB driver in order to connect from the computer to the board and push code. Google for the CH340G driver. Fortunately on Win10 I didn't hit that issue, though I am not sure whether I had already installed it as part of some earlier experiment.

Pin layout

If we just google for NodeMCU quick start tutorials we get steps which take us towards blinking the LED on the board. When I got the board I tried the same, but the code in those tutorials never managed to blink the LED.

At first I was under the impression that they had shipped a faulty one, but the rating of the listing made me rethink: how could that listing get 108 five star ratings if they ship faulty products? So I started deep googling. Deep googling seems to be the new term for when normal googling doesn't help; that deserves a separate post.

It landed me in another world of incompatibilities among the NodeMCU boards. Below is one post discussing the LED built into the board and the pin number that drives it. The interesting thing is that we need to output 0 to turn the LED on and vice versa.

https://arduino.stackexchange.com/questions/38477/does-the-node-mcu-v3-lolin-not-have-a-builtin-led

Connecting to WiFi

Another tutorial on connecting to WiFi; this is yet to be tested with my board.
http://henrysbench.capnfatz.com/henrys-bench/arduino-projects-tips-and-more/arduino-esp8266-lolin-nodemcu-getting-started/

I am planning to use the board to automatically water our curry leaf plant from the aquarium by sensing moisture in the soil. Hopefully more updates will be coming soon.

References

https://frightanic.com/iot/comparison-of-esp8266-nodemcu-development-boards/

Tuesday, March 27, 2018

Azure @ Enterprise - Functions v/s AWS Lambda

Serverless @ enterprise

Serverless is the buzzword in the software industry right now, after Microservices. As with any other technology revolution, startups jump in first and enterprises enter Serverless only slowly. Earlier, technologies would be enterprise ready from the start; nowadays it seems enterprise support lags a little behind the technology edge.

This post is an attempt to examine whether the Serverless offerings are ready for the enterprise. Since the series is Azure @ Enterprise, let's look from the Azure side.

Cost

Cost traditionally was not a big factor for the enterprise. They usually buy big boxes to meet their peak load, relax during normal load and struggle during heavy load. Whatever capacity planning we do, there is a good chance the real load will differ because of unknowns.

But with cloud adoption, enterprises started thinking about cost. If they have embraced the two-pizza team strategy, the Microservice architecture style or DevOps teams, there is a high chance that small teams maintain services and applications end to end. To prove their success they are forced to reduce operational cost, so ultimately cost becomes a factor in the enterprise.

This pushes the enterprise to adopt Serverless, or true pay per use, cloud offerings. At present in Azure, Functions is the real pay per use compute service, hence the title is limited to Functions. Cost wise, Functions scores well.

Versioning

Versioning might be a concern if the enterprise is using blue-green deployment, where applications move to production and rollbacks may be required.
In Azure Functions there is no versioning concept except pulling from the source control repo. But AWS Lambda, the Functions equivalent in the Amazon cloud, has versioning, so we can have smooth version control of functions even in the production environment.

Security

The next factor is security. There is little chance that the enterprise will compromise on security just because something is cheap. If we take that thought to Functions, there are two challenges right now.

Securing external internet facing Functions

This can be done via API gateways. There is not much challenge in that once a suitable gateway is selected; it ensures protection against DDoS, brute force, injections, etc...

Securing internal Functions

There could be a lot of internal services which are supposed to be exposed only inside the enterprise. Normally the on-premises network is protected using appropriate measures and the endpoints are not visible outside. This is the area where Functions lag.
Suppose we need an internal Function; there are two options. One is to set up firewall rules. The other is to host it inside a vNet (virtual network), so that the Function's endpoint is not accessible from outside.
The best way is to use a vNet, but the problem is that pay per use Functions don't support hosting inside a vNet unless they run in an App Service Environment (ASE). An ASE provides an isolated environment for the enterprise to host things, but it is highly costly: around $1200/month, and the real problem is that the cost is fixed regardless of usage. We lose everything we talked about with Serverless if it is fixed billing.

Amazon Lambda

Amazon has VPC instead of vNet, and there doesn't seem to be a fixed high cost to host Lambda inside a VPC. Please note this information was obtained by googling; feel free to correct it.

Some links below on Lambda and VPC

https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/
https://aws.amazon.com/blogs/apn/why-use-aws-lambda-in-a-custom-vpc/

More differences

There are more links out there which compare Azure Functions with Lambda and with Cloud Functions from Google Cloud.

https://cloudacademy.com/blog/microsoft-azure-functions-vs-google-cloud-functions-fight-for-serverless-cloud-domination-continues/

Tuesday, March 20, 2018

Circular relationship diagram for Microservices and Serverless architectures

Microservices is the trend of splitting large systems into individual self-contained subsystems. With the advancement of Serverless offerings, more and more systems are being created with, or moving towards, the Microservice approach.

Is Microservice same as Serverless?

Before we go further, let us clarify the difference between Microservices and Serverless. Some readers may think both are the same and Serverless is just more advanced; to be frank, there are fundamental differences between the two.

Microservice - It is about splitting a system into small subsystems which are deliverable by themselves. Microservices can be hosted either in the cloud or in on-premises data centers. A Microservice has to manage its own data, and that data should never be touched by other systems directly.
If the system consists of a couple of services and both are maintained by the same team, those services don't necessarily have to be separate.

Serverless - It is about freeing developers from worrying about the underlying infrastructure. They just write the code and hand it to the hosting environment, which takes care of scaling, fault tolerance, patching, etc... in order to maintain promised parameters such as predictable throughput, OS level security and so on.

Problem

The problem with Microservices is their integration. In a single big monolith the integration is handled by class relationships; when we move to Microservices the complexity moves from the class level to the service integration level. The overall complexity of the system does not change. It is easy when we start a Microservice based system from scratch or convert an existing one, but it gets difficult when the count of Microservices goes beyond 50 or 100, especially to understand their relationships to each other: who calls whom, over what protocol, i.e. the dependencies.
The technical solution for this problem is to implement a service mesh or a service discovery mechanism so that we have the connection details in a central place.

But how do we create a diagram to represent those relationships? If we use the traditional blocks-and-arrows model, it takes more time to understand than it took to code. So let's take another approach.

Circular Relationship Diagram

There doesn't seem to be a standard name for a diagram which puts the entities on the edge of a circle and shows connections by drawing lines between them through the circle. As programmers, let's try to generate one with code rather than buying software.

D3.JS for Circular Relationship Diagram

Most of us are familiar with the D3.js library for charting in the browser. That library has support for creating circular relationship diagrams (hierarchical edge bundling). Below goes one link.

http://mbostock.github.io/d3/talk/20111116/bundle.html

The above is not intended for plotting relationships between Microservices, but as programmers we can easily swap its data with our Microservice relationships. The sample doesn't need any server side coding; all it needs is a web server to serve the HTML pages, which can be IIS, Apache or anything else.
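
For example, assuming the same kind of name/imports data shape the sample loads (verify the exact file and field names against the sample's own data file), the Microservice relationships could be expressed roughly like this, where each entry lists the services it calls and the service names are purely illustrative:

[
  { "name": "ordering.OrderService",      "imports": ["billing.InvoiceService", "catalog.ProductService"] },
  { "name": "billing.InvoiceService",     "imports": ["notification.EmailService"] },
  { "name": "catalog.ProductService",     "imports": [] },
  { "name": "notification.EmailService",  "imports": [] }
]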

Thanks to the D3 library and the author of the sample.

Other references

https://github.com/nicgirault/circosJS
https://modeling-languages.com/javascript-drawing-libraries-diagrams/

Tuesday, March 13, 2018

Uploading large files from browser to ASP.Net web server - WebSockets

Problem

Most applications are now web applications. When we are in the browser we are in a restricted area and cannot code freely to leverage the machine's full capabilities. One of the areas where this limitation applies is uploading large files, which is a normal use case. Small files we can easily upload using the file input control, but when the file size grows we get into trouble.
There are many solutions out there, including third party controls. But what if we need to write one from scratch? What are the options?

Alternatives

We can use ActiveX controls, which help unleash the potential of the machine. With the arrival of the HTML5 File API we have more control in the browser itself, and many third parties make use of it. If there is a way to use ActiveX or third party controls, the problem is solved. If not, or if you think the existing solutions are not enough, continue reading.

Solutions

There are many ways to upload files. Below are the principles adopted for the solution approach.

Principles

  • Never use ActiveX
  • Read the file without breaking the browser memory (chunking)
  • Should support files up to 25 GB at a minimum.
  • Support scaling
There are reasons behind these principles, but they are not relevant at this point. Below is the first solution, which is what this post discusses.

WebSockets

One of the modern ways is WebSockets. Below is a repo which demos how the file can be read using both async and sync methods and sent via WebSockets.

https://github.com/joymon/large-file-upload-from-browser
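
The repo contains the actual client and server code. Purely to illustrate the server side idea, a minimal ASP.NET handler that accepts a WebSocket and appends incoming binary chunks to a file could look roughly like this; the class names, buffer size and target path are placeholders rather than the repo's code, and it needs System.Web.WebSockets, System.Net.WebSockets, System.IO and System.Threading.

public class UploadSocketHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        if (context.IsWebSocketRequest)
        {
            context.AcceptWebSocketRequest(HandleUpload);
        }
    }

    private static async Task HandleUpload(AspNetWebSocketContext socketContext)
    {
        WebSocket socket = socketContext.WebSocket;
        byte[] buffer = new byte[64 * 1024]; // 64 KB frames; keep in sync with the client chunk size.
        using (FileStream target = File.Create(@"D:\uploads\incoming.bin"))
        {
            while (socket.State == WebSocketState.Open)
            {
                WebSocketReceiveResult result =
                    await socket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
                if (result.MessageType == WebSocketMessageType.Close)
                {
                    await socket.CloseAsync(WebSocketCloseStatus.NormalClosure, "Upload complete", CancellationToken.None);
                    break;
                }
                // Append only the bytes actually received in this frame.
                await target.WriteAsync(buffer, 0, result.Count);
            }
        }
    }
}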

Feel free to surf the repo and comment. More development notes will be added soon.

Results

  • Using a Web Worker with the FileReaderSync API seems faster than the async FileReader API.

Limitations

Below are some of the limits:
  1. The connection has to stay open, which affects scaling.
  2. Socket connection management is tedious.
  3. The server is slower at reading and writing the data than the client; the client reports it has sent all the data long before the server finishes.

References

Tuesday, March 6, 2018

Azure @ Enterprise - Limits on AppServices

The App Service mechanism offers a free tier to host Azure WebApps. Though it has a 2 minute execution timeout, it is a good place to host web sites. If we ask whether this is enterprise friendly, we end up with different answers. It has an enterprise friendly mechanism called App Service Environment (ASE) to host applications in an isolated environment, i.e. an environment which is not open to the public, where the enterprise can host its internal services.

Though technically it says we can host any .Net application, it has a good number of limits. For example, if we have a legacy application which uses a COM component, it cannot be used in App Service. This post is a journey towards one such limit.

Puppeteer

Puppeteer is a browser automation framework for NodeJS, specifically for Chrome. It helps start a headless browser instance and control it via JavaScript; Puppeteer itself needs a separate post. Ideally, if we host the code in an Azure NodeJS WebApp, it should just work.

But it fails due to the unavailability of some APIs in the WebApp sandbox.

Details on the WebApp limits

Below is the error message.
Error: spawn UNKNOWN
    at _errnoException (util.js:1022:11)
    at ChildProcess.spawn (internal/child_process.js:323:11)
    at Object.exports.spawn (child_process.js:502:9)
    at Function.launch (D:\home\site\wwwroot\node_modules\puppeteer\lib\Launcher.js:81:40)
    at Function.launch (D:\home\site\wwwroot\node_modules\puppeteer\lib\Puppeteer.js:25:21)
    at Server.<anonymous> (D:\home\site\wwwroot\server.js:14:39)
    at emitTwo (events.js:126:13)
    at Server.emit (events.js:214:7)
    at parserOnIncoming (_http_server.js:602:12)
    at HTTPParser.parserOnHeadersComplete (_http_common.js:116:23) code: 'UNKNOWN', errno: 'UNKNOWN', syscall: 'spawn' }
started server.js

How do we reach the conclusion that Puppeteer is not supported in Azure WebApp? Let's look at the issue logged in their GitHub project.


That takes us to the Azure Functions page, which says Windows based Azure Functions don't support headless browsers. Really? Are there Windows based Functions and Linux based Functions?

Unfortunately, yes. The below link explains the same.

This again takes us to the Kudu wiki page where the limit is documented.
https://github.com/projectkudu/kudu/wiki/Azure-Web-App-sandbox#win32ksys-user32gdi32-restrictions

Moral of the story. Read the docs and understand what it really means.