Planet Cloud [Computing]

May 19, 2012

OakLeaf Systems


Windows Azure and Cloud Computing Posts for 5/17/2012+

A compendium of Windows Azure, Service Bus, EAI & EDI, Access Control, Connect, SQL Azure Database, and other cloud-computing articles.


Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:


Azure Blob, Drive, Table, Queue and Hadoop Services

Benjamin Guinebertière (@benjguin) posted Windows Azure tables: partitioning guid row keys on 5/18/2012. From the English version:

This post is about using a partition key when the row key is a GUID in Windows Azure Table Storage. You can read more about partitioning Windows Azure tables in this MSDN article.

Here is some context on where this sample code comes from.

I’m currently developing some sample code that works as a canvas application in Facebook. As this type of app runs in an iFrame, the use of cookies is restricted. Still, I wanted to keep an authentication token from page to page (the Facebook signed_request) without displaying it explicitly in the URL (it might also be sent in HTTP referer headers to external sites). So I decided to store the signed_request in a session. ASP.NET sessions can have their ID stored in the URL instead of cookies, but the ASP.NET pipeline does not provide the session yet while authenticating. So I created my own storage for the authentication and authorization session (Auth Session for short). I did it in Windows Azure tables so that it can easily scale out.

The functional key is a GUID (I don’t want user X to guess user Y’s authSessionKey). The key is passed from page to page as a query parameter; typically, app URLs look like this: https://myhost/somepath?authSessionKey=3479D7A2-5D1A-41A8-B8FF-4F62EB1A07BB.

Still, for this to scale horizontally, I need a partition key. Here is the code I used:

internal class AuthSessionDataSource
{
//...
        public const int nbPartitions = 15;
// ...

public static class AuthSessionState
{
//...
    private static string PartitionKeyForGuid(Guid guid)
    {
        int sumOfBytes = 0;
        foreach (var x in guid.ToByteArray())
        {
            sumOfBytes += x;
        }
        int partitionNumber = sumOfBytes % AuthSessionDataSource.nbPartitions;
        return partitionNumber.ToString();
    }
//...

The principle is to use, as the partition number, the remainder of the sum of all the GUID’s bytes divided by the number of partitions.
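For readers who do not work in C#, the same principle can be sketched in Python. This is a minimal port, not the author's code: the name `partition_key_for_guid` and the default of 15 partitions simply mirror the snippet above.

```python
import uuid

NB_PARTITIONS = 15  # mirrors nbPartitions in the C# snippet


def partition_key_for_guid(guid: uuid.UUID, nb_partitions: int = NB_PARTITIONS) -> str:
    """Return (sum of the 16 GUID bytes) modulo nb_partitions, as a string."""
    # .NET's Guid.ToByteArray() and Python's UUID.bytes order the bytes
    # differently, but a sum does not depend on order, so the key is the same.
    return str(sum(guid.bytes) % nb_partitions)


if __name__ == "__main__":
    session_key = uuid.UUID("3479D7A2-5D1A-41A8-B8FF-4F62EB1A07BB")
    print(partition_key_for_guid(session_key))
```

Because the key depends only on the GUID and the partition count, any tier that knows both can recompute it without a lookup.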

In order to have a rough idea of what it provides, here is a small console application (code, then sample execution result).

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            for (int i = 0; i < 36; i++)
            {
                Guid g = Guid.NewGuid();
                Console.WriteLine("{0}, {1}", g, PartitionKeyForGuid(g));
            }
            Console.ReadLine();


        }


        private static string PartitionKeyForGuid(Guid guid)
        {
            const int nbPartitions = 12;

            int sumOfBytes = 0;
            foreach (var x in guid.ToByteArray())
            {
                sumOfBytes += x;
            }
            int partitionNumber = sumOfBytes % nbPartitions;
            return partitionNumber.ToString();
        }

    }
}

[Image: sample execution result]

The advantage is that the partition numbers should be distributed quite regularly and that you can calculate the partition from the row key, as long as the number of partitions doesn’t change.

Should I change and have more partitions as the number of users grows, I could store new users’ sessions in a new table where the number of partitions is higher, while keeping already active users in the old table. Auth sessions don’t live very long, so changing the number of partitions can be quite simple.
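One way to check the "distributed quite regularly" claim is to tally partition keys over many random GUIDs. The sketch below is an illustration under assumptions: it uses seeded 128-bit random values rather than real `Guid.NewGuid()` output, and the helper names are hypothetical.

```python
import random
import uuid
from collections import Counter


def partition_key_for_guid(guid: uuid.UUID, nb_partitions: int) -> str:
    """Sum of the GUID bytes modulo the number of partitions."""
    return str(sum(guid.bytes) % nb_partitions)


def partition_histogram(sample_size: int, nb_partitions: int, seed: int = 0) -> Counter:
    """Count how many random GUIDs land in each partition."""
    rng = random.Random(seed)  # seeded so repeated runs give the same tally
    counts = Counter()
    for _ in range(sample_size):
        guid = uuid.UUID(int=rng.getrandbits(128))
        counts[partition_key_for_guid(guid, nb_partitions)] += 1
    return counts


if __name__ == "__main__":
    histogram = partition_histogram(sample_size=15000, nb_partitions=15)
    for key in sorted(histogram, key=int):
        print(key, histogram[key])
```

With 15,000 samples and 15 partitions, each bucket should hold roughly 1,000 entries; a large deviation would suggest the scheme does not suit the key source.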

Why not use the row key as the partition key? Having several rows in the same partition allows batching, which is also good for performance. For instance, I have to remove old sessions. As a batch can only span a single partition, and as no more than 100 rows can be batched together, here is the code to purge old Auth sessions:

public const int nbDaysForOldSessions = 3;
//...
internal void RemoveOldValues()
{
    DateTime oldDate = DateTime.UtcNow.AddDays(-1 * nbDaysForOldSessions);

    for(int p=0; p<nbPartitions; p++)
    {
        string partitionKey = p.ToString();
        var query = (from c in context.AuthSessionStateTable
                     where c.PartitionKey == partitionKey
                     && c.Timestamp <= oldDate
                     select c)
                    .AsTableServiceQuery<AuthSessionStateEntity>();
        var result = query.Execute();
        int i = 0;
        foreach (var x in result)
        {
            i++;
            this.context.DeleteObject(x);
            if (i >= 100)
            {
                this.context.SaveChangesWithRetries();
                i = 0;
            }
        }
        this.context.SaveChangesWithRetries();
    }
}

In my case, partitioning the data this way seems to be a good fit.
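The two batching constraints mentioned in this post (a batch stays within one partition and holds at most 100 rows) can be sketched independently of any storage SDK. The function below is a hypothetical helper, not part of the Windows Azure client library; it only groups (partition key, row key) pairs into batches that respect both limits.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

MAX_BATCH_SIZE = 100  # limit stated in the post


def batches_by_partition(
    rows: Iterable[Tuple[str, str]], max_batch_size: int = MAX_BATCH_SIZE
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (partition_key, row_keys) batches; no batch spans two partitions."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for partition_key, row_key in rows:
        grouped[partition_key].append(row_key)
    for partition_key, row_keys in grouped.items():
        # Slice each partition's rows into chunks of at most max_batch_size.
        for start in range(0, len(row_keys), max_batch_size):
            yield partition_key, row_keys[start:start + max_batch_size]


if __name__ == "__main__":
    demo = [("0", f"row-{i}") for i in range(250)] + [("1", "row-x")]
    for key, batch in batches_by_partition(demo):
        print(key, len(batch))
```

Each yielded batch could then be submitted as one save operation, which is the effect the counter in the C# loop achieves.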


Derrick Harris (@derrickharris) posted The unsexy side of big data: 5 tools to manage your Hadoop cluster to GigaOm’s Structure blog on 5/18/2012:

Before you can get into the fun part of actually processing and analyzing big data with Hadoop, you have to configure, deploy and manage your cluster. It’s neither easy nor glamorous (data scientists get all the love), but it is necessary. Here are five tools (not from commercial distribution providers such as Cloudera or MapR) to help you do it.

Apache Ambari

Apache Ambari is an open source project for monitoring, administration and lifecycle management for Hadoop. It’s also the project that Hortonworks has chosen as the management component for the Hortonworks Data Platform. Ambari works with Hadoop MapReduce, HDFS, HBase, Pig, Hive, HCatalog and Zookeeper.

Apache Mesos

Apache Mesos is a cluster manager that lets users run multiple Hadoop jobs, or other high-performance applications, on the same cluster at the same time. According to Twitter Open Source Manager Chris Aniszczyk, Mesos “runs on hundreds of production machines and makes it easier to execute jobs that do everything from running services to handling our analytics workload.”

Platform MapReduce

Platform MapReduce is high-performance computing expert Platform Computing’s entrée into the big data space. It’s a runtime environment that supports a variety of MapReduce applications and file systems, not just those directly associated with Hadoop, and is tuned for enterprise-class performance and reliability. Platform, now part of IBM, built a respectable business managing clusters for large financial services institutions.

StackIQ Rocks+ Big Data

StackIQ Rocks+ Big Data is a commercial distribution of the Rocks cluster management software that the company has beefed up to also support Apache Hadoop. Rocks+ supports the Apache, Cloudera, Hortonworks and MapR distributions, and handles the entire process from configuring bare metal servers to managing an operational Hadoop cluster.

Zettaset Orchestrator

Zettaset Orchestrator is an end-to-end Hadoop management product that supports multiple Hadoop distributions. Zettaset touts Orchestrator’s UI-based experience and its ability to handle what the company calls MAAPS — management, availability, automation, provisioning and security. At least one large company, Zions Bancorporation, is a Zettaset customer.

If there are more Hadoop management tools floating around, please let me know in the comments.



Shaun Connolly posted 7 Key Drivers for the Big Data Market to the Hortonworks blog on 5/14/2012 (missed when posted):

I attended the Goldman Sachs Cloud Conference and participated on a panel focused on “Data: The New Competitive Advantage”. The panel covered a wide range of questions, but kicked off covering two basic questions:

“What is Big Data?” and “What are the drivers behind the Big Data market?”

While most definitions of Big Data focus on the new forms of unstructured data flowing through businesses with new levels of “volume, velocity, variety, and complexity”, I tend to answer the question using a simple equation:

Big Data = Transactions + Interactions + Observations

The following graphic illustrates what I mean:

Big Data Diagram

ERP, SCM, CRM, and transactional Web applications are classic examples of systems processing Transactions. Highly structured data in these systems is typically stored in SQL databases.

Interactions are about how people and things interact with each other or with your business. Web Logs, User Click Streams, Social Interactions & Feeds, and User-Generated Content are classic places to find Interaction data.

Observational data tends to come from the “Internet of Things”. Sensors for heat, motion, pressure and RFID and GPS chips within such things as mobile devices, ATM machines, and even aircraft engines provide just some examples of “things” that output Observation data.

With that basic definition of Big Data as background, let’s answer the question:

What are the 7 Key Drivers Behind the Big Data Market?

Business

  1. Opportunity to enable innovative new business models
  2. Potential for new insights that drive competitive advantage

Technical

  1. Data collected and stored continues to grow exponentially
  2. Data is increasingly everywhere and in many formats
  3. Traditional solutions are failing under new requirements

Financial

  1. Cost of data systems, as a percentage of IT spend, continues to grow
  2. Cost advantages of commodity hardware & open source software

There’s a new generation of data management technologies, such as Apache Hadoop, that are providing an innovative and cost effective foundation for the emerging landscape of Big Data processing and analytics solutions. Needless to say, I’m excited to see how this market will mature and grow over the coming years.

Key Takeaway

Being able to dovetail the classic world of Transactions with the new(er) worlds of Interactions and Observations in ways that drive more business, enhance productivity, or discover new and lucrative business opportunities is why Big Data is important.

One promise of Big Data is that companies who get good at collecting, aggregating, refining, analyzing, and maximizing the value derived from Transactions, Interactions, and Observations will put themselves in a position to answer such questions as:

What are the behaviors that lead to the transaction?

And even more interestingly:

How can I better encourage those behaviors and grow my business?

So ask yourself, what’s your Big Data strategy?



SQL Azure Database, Federations and Reporting

Mike Benkovich (@mbenko) continued his series with CloudTip #14-How do I get SQL Profiler info from SQL Azure? on 5/18/2012:

Your application is running slow. You need to find out what’s going on. If you’ve used SQL Profiler on a local database you might be familiar with how you can capture a trace of database activity and use it to figure out where your resources are going. The visibility makes it MUCH easier to tune a database than sorting through a bunch of code. The question is, what do you do when you’re moving an app to the cloud?

If you’ve wondered how you can get profiler information from SQL Azure: the new online management portal for SQL Azure has been updated with design, deployment, administration and tuning features built in. The Overview screen provides quick links to the different areas of the portal, as well as easy links to help information from MSDN online. You can get to the portal either by going to the Windows Azure management portal at http://windows.azure.com, signing in, going to the database section and clicking Manage, or by simply browsing to your database server’s URL, https://<myserver>.database.windows.net, where you substitute your database server’s name for <myserver>.


When I log in I can see my databases and get information about size and usage, as well as the ability to dive into specific usage. From there I can go into designing the schema, functions and code around my database. If I swap over to the admin page, though, I have visibility into not just database size and usage, but also a link to query performance. Clicking this takes me to where I can see profile data from queries.


I can sort and see which calls to the database are most frequent as well as most expensive in terms of resource usage. Further I can select one and dive even deeper to see the execution plan and statistics around the calls. This information is key to making decisions on indexes and design of a well performing database.


In the query plan I can look for table scans or other expensive operations and, if it makes sense, determine whether additional indexes would be useful.


Nice!


Cihan Biyikoglu (@cihangirb) explained ID Generation in Federations in Azure SQL Database: Identity, Sequences, Timestamp and GUIDs (Uniqueidentifier) on 5/17/2012:

Identity and timestamp are important pieces of functionality that many existing apps use for generating IDs. Federations impose some restrictions on identity and timestamp, and clearly we need alternatives that can scale to the targets federations aim for. So I’ll dive into alternatives and options in this post.

Let’s take a look at identity and timestamp first in detail to understand why they are not good fits in federations. Identity is a unique ID generation scheme, and timestamp is typically used to implement optimistic concurrency. Digging a little deeper, identity promises to generate a linearly increasing value without gaps, scoped to a table in a given database. It provides the ability to reseed and gives many apps an easy way to ensure uniqueness. Timestamp similarly provides a uniquely increasing value for all updates, scoped to the database. Federations use databases as their building block in an elastic fashion, so the scope of a database changes as the federation is repartitioned. This means that, when identity or timestamp is used on columns, you may end up with duplicate values generated in two separate members that later get merged together. We could generate the values in the root and make them globally unique, but we would end up with a single choke point that would limit your throughput. So what to do?

There are a number of options for distributed systems.

  • GUID/Uniqueidentifier as a unique ID generation method: I strongly recommend using uniqueidentifier as an identifier. It is globally unique by definition and does not require funneling generation through some centralized logic. Unlike identity and timestamp, uniqueidentifiers can be generated at any tier of the app. With uniqueidentifiers you give up one property: ID generation is no longer sequential. So what if you’d like to understand the order in which a set of rows were inserted? Well, that is easy to do within an atomic unit. You can use a date+time data type with high enough resolution to give you ordering; for example, the datetime2 and datetimeoffset data types both have resolution to 1/1000000 of a second. So these types have great precision for ordering events.

This is more of an academic topic, and I don’t expect many folks to try this, but I’ll still mention that I strongly advise against trying to order across atomic units. Here is the core of the issue: if you need to sort across AUs, datetimeoffset may still work. However, it is easy to forget that there isn’t a centralized clock in a distributed system. Because of the arbitrary number of repartitioning operations that may have happened over time, the date+time values may come from many nodes, and nodes are not guaranteed to have synchronized clocks (they can be a few minutes apart). Given no centralized clock, datetime values across atomic units may not reflect the exact order in which things happened.

Well, how about the difficulty of partitioning over ranges of uniqueidentifiers? GUIDs are sortable, so this is nothing new, but their sort order may get confusing. However, it is fairly easy to understand once you see the explanation. This one explains the issue well: How are GUIDs sorted by SQL Server. It is a mind-stretching exercise, but I expect we’ll have tooling to help out with some of this in future.
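To make the "confusing" sort order concrete, here is a small Python sketch. It rests on an assumption that should be verified against your own server: that uniqueidentifier values compare byte by byte in the order used by .NET's SqlGuid type (the last six bytes are most significant, the first four least). The names `SQL_GUID_BYTE_ORDER` and `sql_guid_sort_key` are illustrative only.

```python
import uuid

# Byte positions, most significant first, applied to the 16-byte layout that
# .NET's Guid.ToByteArray() produces. ASSUMPTION: this mirrors the comparison
# order of .NET's SqlGuid; confirm against your server before relying on it.
SQL_GUID_BYTE_ORDER = (10, 11, 12, 13, 14, 15, 8, 9, 6, 7, 4, 5, 0, 1, 2, 3)


def sql_guid_sort_key(guid: uuid.UUID) -> tuple:
    """Build a sort key that compares bytes in the assumed SQL Server order."""
    raw = guid.bytes_le  # same mixed-endian layout as Guid.ToByteArray()
    return tuple(raw[i] for i in SQL_GUID_BYTE_ORDER)


if __name__ == "__main__":
    a = uuid.UUID("00000000-0000-0000-0000-000000000001")
    b = uuid.UUID("ff000000-0000-0000-0000-000000000000")
    # As plain strings a sorts before b; under the assumed order b comes first.
    for guid in sorted([a, b], key=sql_guid_sort_key):
        print(guid)
```

Under this assumption, range boundaries for a federation keyed on uniqueidentifier are driven mainly by the last group of the GUID, not the first.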

Last but not least, many people have experiences that suggest GUIDs (uniqueidentifiers) are bad candidates for clustering keys, given that they are not ordered and cause page splits, leading to higher latencies and fragmentation. Not so on SQL Azure, at least not to the degree you experience in on-premises SQL Server. SQL Azure databases give you 3 replicas, which means the characteristics of writes are very different from those of a single SQL Server database without HA. In SQL Azure, writes have to be confirmed by 2 out of the 3 copies and thus are always network-level writes. A network write has much higher latency than anything a page split would cause; a page split accounts for a tiny amount of this latency and is not visible to the naked eye. You do end up with some fragmentation with uniqueidentifier, that is true. However, fragmentation is hard to get rid of completely, and unordered inserts don’t cause as much fragmentation as deletes or expanding updates, so my experience says clustering on uniqueidentifiers is no reason to worry. This is easy to try: simply run the following script and watch your latencies. See if you can make an unordered insert, as is the case for GUIDs, take longer over many inserts:

use federation root with reset
go
drop table t1
drop table t2
go
create table t1(c1 int identity primary key, 
  c2 uniqueidentifier default newid(), 
  c3 char(200) default 'a')
create table t2(c1 int identity, 
  c2 uniqueidentifier default newid() primary key, 
  c3 char(200) default 'a')
go
-- MEASURE T1 
set nocount on
declare @s datetime2
set @s=getdate()
declare @i int
set @i=0
begin tran
while (@i<1000000)
begin
     if (@i%1000=0)
          begin 
               commit tran
               begin tran
          end
     insert into t1(C2, C3) values(default,default)
     set @i=@i+1
end
commit tran
select datediff(ss,@s, getdate()) 'total seconds for t1'
go
-- MEASURE T2
set nocount on
declare @s datetime2
set @s=getdate()
declare @i int
set @i=0
begin tran
while (@i<1000000)
begin
     if (@i%1000=0)
          begin 
               commit tran
               begin tran
          end
     insert into t2(C2, C3) values(default,default)
     set @i=@i+1
end
commit tran
select datediff(ss,@s, getdate()) 'total seconds for t2'
go
  • Datetime2 for optimistic concurrency: Timestamp replacement is much easier to talk about. I recommend using datetime2, with its 1/1000000 of a second resolution, instead of timestamp. Simply ensure that every time you update a row you also update its modified_date, and use that value just like a timestamp: compare it before updates to detect update conflicts.

That said, we plan to enhance these functions to be meaningful for federations in future and will also remove some restrictions in members, especially on reference tables. If you need any more details on these, you can always reach me through the blog.
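The compare-before-update pattern can be sketched as follows. This is an illustration under stated assumptions, not SQL Azure code: Python's built-in sqlite3 stands in for the database, an ISO 8601 string stands in for a datetime2 column, and the `items` table and its column names are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def utc_now() -> str:
    """ISO 8601 UTC time with microseconds; stands in for a datetime2 value."""
    return datetime.now(timezone.utc).isoformat(timespec="microseconds")


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with one row and a fixed modified_date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id TEXT PRIMARY KEY, value TEXT, modified_date TEXT)"
    )
    conn.execute(
        "INSERT INTO items VALUES (?, ?, ?)",
        ("row-1", "initial", "2012-05-17T00:00:00.000000+00:00"),
    )
    conn.commit()
    return conn


def update_if_unchanged(conn, row_id, new_value, expected_modified_date) -> bool:
    """Apply the update only if modified_date still matches what the caller read."""
    cursor = conn.execute(
        "UPDATE items SET value = ?, modified_date = ? "
        "WHERE id = ? AND modified_date = ?",
        (new_value, utc_now(), row_id, expected_modified_date),
    )
    conn.commit()
    return cursor.rowcount == 1  # zero rows touched means someone else updated first


if __name__ == "__main__":
    db = open_demo_db()
    seen = db.execute("SELECT modified_date FROM items WHERE id = 'row-1'").fetchone()[0]
    print(update_if_unchanged(db, "row-1", "first writer", seen))
    print(update_if_unchanged(db, "row-1", "second writer", seen))
```

The second call reuses the stale value it read earlier, so its WHERE clause matches no row and the conflict is detected. Two updates landing within the same clock tick would go unnoticed, which is why the resolution of the date type matters.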


Bud Aaron posted Writing a small application to manipulate the [Windows Azure] SQL Database [a.k.a., SQL Azure] firewall on 5/18/2012:

Arg – it’s no longer SQL Azure, it’s now SQL Database! This article started out as a simple discussion of how to manipulate the SQL Database (was SQL Azure) firewall through REST calls, but on the way Microsoft threw me under the bus by completely changing branding names for what was Azure. To crawl out from under the bus I decided to use the new naming conventions which are listed below for all the world to see.

[Image: table of the new Windows Azure service names]

I have to say I’m not terribly happy about the changes but that’s probably not going to change anything so I guess I just need to update my thinking.

Now I know just enough about firewalls to be dangerous. I know they’re designed to help prevent uninvited guests from messing with my data, and that you can poke holes in them to allow invited guests into the database. I call it poking holes in the firewall. Here’s a walkthrough on getting through a firewall to allow sqlserver.exe remote access.

Doing it the easy way

The Windows Azure Portal provides a way to Add, Edit and Delete firewall rules. First select Database in the portal and navigate to the SQL database you’ve set up.

Windows Azure Portal

Select the Firewall Rules: button under Server Information to get this:

Firewall Information

Notice that the button says Firewall Rules followed by “: 2” indicating that 2 rules are in place. Click the Add button to get this:

Add new firewall rule

Fill in a new firewall rule:

Fill in firewall rule information

And click OK to get this:

Firewall rule overview

You can also Update or Delete the rules here. This is truly the simplest way to poke holes in the firewall as needed but now let’s do it by writing a program in Visual Studio 11 using Visual Basic.

First Things First

In order to get this done you will need to create a self-signed certificate. In order to make the certificate easily findable just make a temp folder on the C: drive. It’s a very short navigation trip when you need to point to files such as the certificate we’re going to generate.

Creating a certificate

In order to do many of the things we’re planning to, we’ll need an encryption certificate. In a production environment you will want to get your certificate from one of the many companies that issue certificates, but for development you can use the makecert command to generate a self-signed certificate. To do this I suggest you create a C:\temp folder, making the certificate easy to retrieve. In the Start menu under Visual Studio 2010 Tools you will find a command prompt. Click this to bring up the VS 2010 command prompt and navigate to your newly created temp folder.

It may be that I’m the only person in the world who just recently learned how to ‘paste’ in a command window but I’m so proud. On the off chance that you’ve never used it, I’m going to explain how it’s done. Clicking the little C:\ icon in the upper left corner of the command window brings up the following menu.

Now you can copy the makecert command shown below into the command window and press return to execute the command. Better yet copy it into notepad and edit it to suit your needs, then copy and paste to execute the command.

makecert -sky exchange -r -n "CN=dnccert.cer" -pe -a sha1 -len 2048 -ss My "dnccert.cer"

I tell you this because I HATE trying to type long commands into the command window because I invariably mistype at least a half dozen times and then frequently get an error. So now I compose the command in notepad and then copy and paste it. I wish I’d known about this years ago. You’ll get the following message when your certificate has been successfully created in the C:\temp folder.

You can find more detailed information about the makecert command here: http://msdn.microsoft.com/en-US/library/bfsktky3(v=VS.80).aspx

Add the certificate to your portal

Open your Windows Azure management portal and select Hosted Services, Storage Accounts & CDN. Then select Management Certificates. You should see this screen:

Hosted Services, Storage Accounts & CDN

Click the Add Certificate icon in the top left corner. In the dialog box, browse for the certificate in the C:\temp folder if that’s where you saved the certificate.

Add Management Certificate

Click OK to import the certificate and then make a copy of the Thumbprint in the text file you should be using to save things you need for this project.
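If you want to double-check the thumbprint you copied from the portal, you can compute it locally: a certificate thumbprint is the SHA-1 hash of the certificate's DER-encoded bytes. The sketch below assumes the .cer file is binary DER rather than base64 text; the function names are hypothetical.

```python
import hashlib


def certificate_thumbprint(der_bytes: bytes) -> str:
    """Return the SHA-1 thumbprint (uppercase hex) of DER-encoded certificate bytes."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def thumbprint_of_file(path: str) -> str:
    """Read a DER-encoded .cer file and return its thumbprint."""
    with open(path, "rb") as handle:
        return certificate_thumbprint(handle.read())


if __name__ == "__main__":
    # The path is an example; point it at wherever you saved your certificate.
    print(thumbprint_of_file(r"C:\temp\dnccert.cer"))
```

The result is a 40-character hexadecimal string that can be compared with the value shown in the portal, ignoring any spaces.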

Managing the Azure Firewall

The app we’re building is named SQLAzureFirewallManagement. The main form will be named FirewallDetails. I will use the capture below to give you layout details for the main form. I’ve given each of the controls a number and the table following the dialog will show that number followed by the control type, its name and text, followed by its left and top position, and finally its height and width.

Now the code:
Imports System.Collections.Generic
Imports System.ComponentModel
Imports System.Data
Imports System.Drawing
Imports System.Linq
Imports System.Text
Imports System.Threading.Tasks
Imports System.Windows.Forms
Imports System.Xml.Linq
Imports System.Security.Cryptography.X509Certificates
Imports System.IO
Imports System.Net

Public Class FirewallDetails

    Private Sub btnBrowse_Click(sender As Object, e As EventArgs) Handles btnBrowse.Click
        Dim input As String = String.Empty
        Dim ofd As New OpenFileDialog()
        ofd.Filter = "cer files (*.cer)|*.cer"
        ofd.InitialDirectory = "C:\temp"
        ofd.Title = "Select a certificate"

        If ofd.ShowDialog() = System.Windows.Forms.DialogResult.OK Then
            txtCertPath.Text = ofd.FileName
        End If

    End Sub

    Private Sub btnOK_Click(sender As Object, e As EventArgs) Handles btnOK.Click
        Dim certfile As String
        Dim subscriptionID As String
        Dim servername As String

        certfile = txtCertPath.Text
        subscriptionID = txtSubID.Text
        servername = txtServerName.Text

        firewallList.Items.Clear()
        GetServerFirewallRules(certfile, subscriptionID, servername)

    End Sub

    Private Sub GetServerFirewallRules(certfile As String,
            subscriptionID As String, server As String)
        Try
            Dim url As String = String.Format(
                "https://management.database.windows.net:8443/{0}/servers/{1}/firewallrules",
                subscriptionID, server)
            Dim webRequest As HttpWebRequest = TryCast(HttpWebRequest.Create(url),
HttpWebRequest)

            webRequest.ClientCertificates.Add(New X509Certificate(certfile,
"private key password"))
            webRequest.Headers("x-ms-version") = "1.0"
            webRequest.Method = "GET"

            comboList.Items.Clear()

            Using webresponse As WebResponse = webRequest.GetResponse()
                Using stream As Stream = webresponse.GetResponseStream()
                    Using sr As New StreamReader(stream)
                        Dim xml As String = sr.ReadToEnd()
                        Dim doc As XDocument = XDocument.Parse(xml)

                        Dim sc As XNamespace = "http://schemas.microsoft.com/sqlazure/2010/12/"

                        Dim query = From s In doc.Elements(sc + "FirewallRules").Elements(sc + "FirewallRule") Select s

                        firewallList.Items.Add("=========================")

                        For Each elm As XElement In query
                            firewallList.Items.Add(elm.Element(sc + "Name").Value.ToString())
                            comboList.Items.Add(elm.Element(sc + "Name").Value.ToString())
                            firewallList.Items.Add(elm.Element(sc +
"StartIpAddress").Value.ToString())
                            firewallList.Items.Add(elm.Element(sc +
"EndIpAddress").Value.ToString())

                            firewallList.Items.Add("=========================")
                        Next
                    End Using
                End Using
            End Using
        Catch webEx As WebException
            Dim errorResponse As HttpWebResponse = DirectCast(webEx.Response,
HttpWebResponse)

            Try
                Using errrs As Stream = errorResponse.GetResponseStream()
                    Using sr As New StreamReader(errrs)
                        MessageBox.Show(sr.ReadToEnd().ToString())
                    End Using
                End Using
            Catch innerex As Exception
                MessageBox.Show(innerex.ToString())
            End Try
        Catch ex As Exception
            MessageBox.Show(ex.ToString() + vbLf)
        End Try
    End Sub

    Private Sub btnAdd_Click(sender As Object, e As EventArgs) Handles btnAdd.Click
        Dim certfile As String
        Dim subscriptionID As String
        Dim servername As String
        Dim ruleName As String
        Dim startIP As String
        Dim endIP As String

        certfile = txtCertPath.Text
        subscriptionID = txtSubID.Text
        servername = txtServerName.Text
        ruleName = txtRuleName.Text
        startIP = txtStartIP.Text
        endIP = txtEndIP.Text

        SetServerFirewallRule(certfile, subscriptionID, servername, ruleName,
startIP, endIP)

    End Sub

    Private Sub SetServerFirewallRule(certfile As String, subscriptionID As String,
server As String, ruleName As String, startIP As String, endIP As String)
        Try
            Dim url As String = String.Format(
                "https://management.database.windows.net:8443/{0}/servers/{1}/firewallrules/{2}",
                subscriptionID, server, ruleName)
            Dim webRequest As HttpWebRequest = TryCast(HttpWebRequest.Create(url),
HttpWebRequest)

            webRequest.ClientCertificates.Add(New X509Certificate(certfile,
"private key password"))
            webRequest.Headers("x-ms-version") = "1.0"
            webRequest.Method = "PUT"

            ' Request body: a FirewallRule element carrying the start and end IP addresses.
            Dim xmlbody As String =
                "<?xml version=""1.0"" encoding=""utf-8""?>" + vbLf +
                "<FirewallRule" + vbLf +
                " xmlns=""http://schemas.microsoft.com/sqlazure/2010/12/""" + vbLf +
                " xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance""" + vbLf +
                " xsi:schemaLocation=""http://schemas.microsoft.com/sqlazure/2010/12/ FirewallRule.xsd"">" + vbLf +
                " <StartIpAddress>" + startIP + "</StartIpAddress>" + vbLf +
                " <EndIpAddress>" + endIP + "</EndIpAddress>" + vbLf +
                "</FirewallRule>"

            Dim bytes As Byte() = Encoding.UTF8.GetBytes(xmlbody)
            webRequest.ContentLength = bytes.Length
            webRequest.ContentType = "application/xml;charset=utf-8"

            Using requestStream As Stream = webRequest.GetRequestStream()
                requestStream.Write(bytes, 0, bytes.Length)
            End Using

            Using response As WebResponse = webRequest.GetResponse()
                MessageBox.Show("Rule Added")
            End Using
        Catch webEx As WebException
            Dim errorResponse As HttpWebResponse = DirectCast(webEx.Response,
HttpWebResponse)

            Try
                Using errrs As Stream = errorResponse.GetResponseStream()
                    Using sr As New StreamReader(errrs)
                        MessageBox.Show(sr.ReadToEnd().ToString())
                    End Using
                End Using
            Catch innerex As Exception
                MessageBox.Show(innerex.ToString())
            End Try
        Catch ex As Exception
            MessageBox.Show(ex.ToString() + vbLf)
        End Try
    End Sub

    Private Sub btnDelete_Click(sender As Object, e As EventArgs) Handles btnDelete.Click
        Dim certfile As String
        Dim subscriptionID As String
        Dim servername As String
        Dim ruleName As String

        certfile = txtCertPath.Text
        subscriptionID = txtSubID.Text
        servername = txtServerName.Text
        ruleName = comboList.SelectedItem.ToString()

        DeleteServerFirewallRule(certfile, subscriptionID, servername,
ruleName)

    End Sub

    Private Sub DeleteServerFirewallRule(certfile As String,
            subscriptionID As String, server As String, ruleName As String)
        Try
            Dim url As String = String.Format(
                "https://management.database.windows.net:8443/{0}/servers/{1}/firewallrules/{2}",
                subscriptionID, server, ruleName)
            Dim webRequest As HttpWebRequest = TryCast(HttpWebRequest.Create(url),
HttpWebRequest)

            webRequest.ClientCertificates.Add(New X509Certificate(certfile,
"private key password"))
            webRequest.Headers("x-ms-version") = "1.0"
            webRequest.Method = "DELETE"

            Using wr As WebResponse = webRequest.GetResponse()
                Using stream As Stream = wr.GetResponseStream()
                    Using sr As New StreamReader(stream)
                        MessageBox.Show("Rule Deleted")
                        firewallList.Items.Clear()
                    End Using
                End Using
            End Using
        Catch webEx As WebException
            Dim errorResponse As HttpWebResponse = DirectCast(webEx.Response,
HttpWebResponse)

            Try
                Using errrs As Stream = errorResponse.GetResponseStream()
                    Using sr As New StreamReader(errrs)
                        MessageBox.Show(sr.ReadToEnd().ToString())
                    End Using
                End Using
            Catch innerex As Exception
                MessageBox.Show(innerex.ToString())
            End Try
        Catch ex As Exception
            MessageBox.Show(ex.ToString() & vbLf)
        End Try
    End Sub
End Class
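
The request that DeleteServerFirewallRule issues is simple enough to sketch outside .NET. The Python fragment below only builds the request (method, URL and x-ms-version header) and does not send it; the function name build_delete_rule_request and the sample values are hypothetical, and attaching the management certificate is left out. Percent-encoding the rule name is an addition that the VB sample above does not perform.

```python
from urllib.parse import quote
from urllib.request import Request

# Host and port as they appear in the sample's URL template.
MANAGEMENT_ROOT = "https://management.database.windows.net:8443"


def build_delete_rule_request(subscription_id: str, server: str, rule_name: str) -> Request:
    """Build (but do not send) the DELETE request for one firewall rule."""
    # Encode the rule name so that names containing spaces stay valid in the path.
    url = "{0}/{1}/servers/{2}/firewallrules/{3}".format(
        MANAGEMENT_ROOT, subscription_id, server, quote(rule_name, safe="")
    )
    request = Request(url, method="DELETE")
    request.add_header("x-ms-version", "1.0")
    return request


if __name__ == "__main__":
    # Placeholder values, not a real subscription or server.
    req = build_delete_rule_request("my-subscription-id", "myserver", "Office Network")
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes the URL and header easy to inspect before any call reaches the management service.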

MarketPlace DataMarket, Social Analytics, Big Data and OData

My (@rogerjenn) Updated Accessing US Air Carrier Flight Delay DataSets on Windows Azure Marketplace DataMarket and “DataHub” post gained a Visualizing Flight Delay Data with Tableau Software section on 5/19/2012:

Visualizing Flight Delay Data with the Tableau Public Application

Tableau Software (@Tableau) publishes data visualization software with an emphasis on big data. According to the Tableau blurb on OakLeaf’s US Air Carrier Flight Delays offering on the Windows Azure Marketplace DataMarket:

Tableau provides drag-and-drop data visualization based on best practices and patented technology from Stanford University. Tableau allows you to publish dashboards to the web with one click. It’s rapid-fire business intelligence that anyone can use.

According to the publisher:

Tableau Public is a free service that lets you create and share data visualizations on the web. Thousands use it to share data on websites and blogs and through social media like Facebook and Twitter. Tableau Public allows you to see data efficiently and powerfully without any programming.

Easy drag & drop interface:
  • No programming language
  • No plug-ins
  • No Flash, so it shows up on the iPad …
How it works:

Tableau Public visualizations and data are always public. Anyone can interact with your visualizations, see your data, download it, and create their own visualizations from it.

When you save your visualizations, it will be to the publicly accessible Tableau Public web servers -- nothing is saved locally on your computer. You can then embed your visualization on your blog or website or share it through social media or email.

Tableau Public can connect to several data sources, including Microsoft Excel, Microsoft Access, and multiple text file formats. It has a limit of 100,000 rows of data allowed in any single file and there is a 50 megabyte limit on storage space for data. …

Warning: As noted in the preceding quotation, Tableau Public works with a maximum of 100,000 data rows, which you won’t discover until you attempt to use a query that returns more than that amount of data.

The DataMarket provides a terse Using Marketpalce [sic] Datasets with Tableau Public tutorial with no screen captures of successful visualizations. The following procedure doesn’t suffer from those omissions:

1. Download Tableau Public 7.0 from its download page. The download requires entering your email address.

Tip: The Thank You for Downloading … page appears immediately, but you must wait for a few minutes for the Do You Want to Run or Save TableauDesktop.msi … ? message to appear before taking additional actions.

2. Click Run to install the software, accept the license agreement, and click Install. Watch the Getting Started video, if you want. Close the Book 1 page.

3. Open the US Air Carrier Flight Delays dataset, sign into the DataMarket with an account that has a subscription to the dataset, click the Explore This Dataset link, and specify LAX as the optional parameter, which returns 86,940 rows with the current dataset. Click the Export button to open the Export pane, and mark the Tableau option in the Export to Program section:

4. Click the lower Download button and click Open when presented with the following message:

5. Click the Show button (see step 3’s screen capture) to display your Primary Account Key, copy it to the Clipboard and paste it in Tableau Public’s Login dialog:

6. Click OK to download the data and open Tableau Public’s main page and drag the Carrier dimension to the top Drop Field Here region, as shown here:

7. Select the DepDelayMinutes measure and drag it to the Rows shelf; select the Carrier dimension and drag it to the Columns shelf to enable appropriate chart styles in the Show Me gallery.

8. Open the Rows menu, select Measure (Sum) and choose Average in the submenu:

9. Here’s the basic column chart from the preceding steps.

You can edit the chart and axis titles, but due to the limited number of rows accommodated, further work usually isn’t warranted.

I have recommended that the DataMarket team add notice of the limitation in the number of rows supported by Tableau Public.


The Astoria (WCF Data Services) Team announced WCF Data Services 5.0.1 Released on 5/18/2012:

We have just pushed the official bits for 5.0.1 to NuGet and should be releasing an updated MSI sometime within the next week. Thanks to feedback from the community we found and fixed several bugs, detailed below.

NuGet Links
Release Notes
  • Added configuration setting to ODataLib to allow users to intercept the XML reader before ODataLib processes the payload
  • Fixes bug where DateTime values do not roundtrip properly in JSON verbose
  • Fixes bug where $metadata requests fail on services with both actions and service operations
  • Fixes bug where $expand requests fail on Oracle providers
  • Fixes bug where vocabulary annotations in an external annotations file fail to resolve the target
What does the MSI do?

All of the fixes detailed above are in the NuGet bits. There is one additional fix in the MSI for code generation against services that have multiple overloads for a function import. One important nuance of the MSI is that it removes the WCF Data Services DLLs from the GAC. (The 5.0.1 MSI is an upgrade to the 5.0 MSI, which means that the 5.0 MSI gets uninstalled and its install actions – including the GACing of the DLLs – are reversed.) We have blogged previously about leaving the GAC.

Where should the MSI be installed?

The MSI should only be installed on development machines as it solely contains tooling fixes. If the MSI is installed on a production Web server running a WCF Data Services 5.0.0 service that was not bin deployed, the service may stop functioning.

Feedback Please

As we begin to release more quickly and issue more prereleases, your feedback is critical to the process.


David Linthicum (@DavidLinthicum) asserted “Big data is not data warehousing or BI, so there are no proven paths -- which is why you need data scientists” in a deck for his Big data in the cloud: It's time to experiment post of 5/18/2012 to InfoWorld’s Cloud Computing blog:

The lure of big data has many people in enterprise IT moving quickly to consolidate and mash up their data assets with other relevant information. The tools are here right now, including big data engines based on Hadoop, public clouds that provide rental access to a huge number of servers, and external cloud-delivered data resources to make better sense of your info.

Take, for example, a manufacturing company that can -- thanks to the use of cloud-based big data -- not only establish the output of its factories for the last 10 years but determine how that output compared with others in its industry, as well as the effects of the weather and other external factors. Moreover, it can predict future factory output through the use of proven algorithms and other relevant data models applied to that big data.

Big data is good. The cloud is good. Now, how do we actually make the whole thing work?

The truth is not many best practices have emerged on how to move to big data. We have the migration to data warehousing and business intelligence as an existing model, but as I look at what big data really is, it's clear that big data adoption is a different type of problem. Much of that experience in data warehousing and BI isn't relevant, and it may even lead to some dead ends.

The art of big data is that it consolidates many types of data resources with different structures and data models, all in a massive, distributed storage system. Big data systems may not enforce a structure, though structure can be layered into the data after the migration. But there are trade-offs in going this route, including migrating unneeded and redundant data that takes up space in the big data system.

For now, the proper path is more through trial and error than following proven concepts. The answer to how to best do big data is the classic consultant's response: It depends on what you're trying to do.

The bottom line is that you have to experiment. But you need not do so blindly. The emerging role of data scientist can help direct those experiments within an appropriate framework, in the manner of research scientists in any field. Data scientists can get you the answers to big data, as long as you understand that a scientist must run a lot of experiments.

At this point, experimentation is the best practice in moving to big data. Get a data scientist or two to design and run these trials.


Tony Baer (@TonyBaer) added “Searching for data scientists as a service” as an introduction to his Big Data Brings Big Changes article of 5/18/2012:

It’s no secret that rocket .. err … data scientists are in short supply. The explosion of data and the corresponding explosion of tools, and the knock-on impacts of Moore’s and Metcalfe’s laws, is that there is more data, more connections, and more technology to process it than ever. At last year’s Hadoop World, there was a feeding frenzy for data scientists, which only barely dwarfed demand for the more technically oriented data architects. In English, that means:

  1. Potential MacArthur Grant recipients who have a passion and insight for data, the mathematical and statistical prowess for ginning up the algorithms, and the artistry for painting the picture that all that data leads to. That’s what we mean by data scientists.
  2. People who understand the platform side of Big Data, a.k.a., data architect or data engineer.

The data architect side will be the more straightforward nut to crack. Understanding big data platforms (Hadoop, MongoDB, Riak) and emerging Advanced SQL offerings (Exadata, Netezza, Greenplum, Vertica, and a bunch of recent upstarts like Calpont) is a technical skill that can be taught with well-defined courses. The laws of supply and demand will solve this one – just as they did when the dot com bubble created demand for Java programmers back in 1999.

Behind all the noise for Hadoop programmers, there’s a similar, but quieter desperate rush to recruit data scientists. While some data scientists call data scientist a buzzword, the need is real.

However, data science will be a tougher number to crack. It’s all about connecting the dots, not as easy as it sounds. The V’s of big data – volume, variety, velocity, and value — require someone who discovers insights from data; traditionally, that role was performed by the data miner. But data miners dealt with better-bounded problems and well-bounded (and known) data sets that made the problem more 2-dimensional.

The variety of Big Data – in form and in sources – introduces an element of the unknown. Deciphering Big Data requires a mix of investigative savvy, communications skills, creativity/artistry, and the ability to think counter-intuitively. And don’t forget it all comes atop a foundation of a solid statistical and machine learning background plus technical knowledge of the tools and programming languages of the trade.

Sometimes it seems like we’re looking for Albert Einstein or somebody smarter.

Nature abhors a vacuum

As nature abhors a vacuum, there’s also a rush to not only define what a data scientist is, but develop programs that could somehow teach it, software packages that to some extent package it, and otherwise throw them into a meat … err, the free market. EMC and other vendors are stepping up to the plate to offer training, not just on platforms, but for data science. Kaggle offers an innovative cloud-based, crowdsourced approach to data science, making available a predictive modeling platform and then staging sponsored 24-hour competitions for moonlighting data scientists to devise the best solutions to particular problems (redolent of the Netflix $1 million prize to devise a smarter algorithm for predicting viewer preferences).

With data science talent scarce, we’d expect that consulting firms would buy up talent that could then be “rented” to multiple clients. Excluding a few offshore firms, few systems integrators (SIs) have yet stepped up to the plate to roll out formal big data practices (the logical place where data scientists would reside), but we expect that to change soon.

Opera Solutions, which has been in the game of predictive analytics consulting since 2004, is taking the next step down the packaging route. Having raised $84 million in Series A funding last year, the company has staffed up to nearly 200 data scientists, making it one of the largest assemblages of genius this side of Google. Opera’s predictive analytics solutions are designed for a variety of platforms, SQL and Hadoop, and today they join the SAP Sapphire announcement stream with a release of their offering on the HANA in-memory database. Andrew Brust provides a good drilldown on the details on this announcement.

From SAP’s standpoint, Opera’s predictive analytics solutions are a logical fit for HANA as they involve the kinds of complex problems (e.g., a computation triggers other computations) that their new in-memory database platform was designed for.

There’s too much value at stake to expect that Opera will remain the only large aggregation of data scientists for hire. But ironically, the barriers to entry will keep the competition narrow and highly concentrated. Of course, with market demand, there will inevitably be a watering down of the definition of data scientists so that more companies can claim they’ve got one… or many.

The laws of supply and demand will kick in for data scientists, but the ramp up of supply won’t be as quick as that for the more platform-oriented data architect or engineer. Of necessity, that supply of data scientists will have to be augmented by software that automates the interpretation of machine learning, but there’s only so far that you can program creativity and counter-intuitive insight into a machine.


See Apigee (@Apigee) announced on 5/18/2012 a Live Webcast: OData Introduction and Impact on API Design scheduled for 5/24/2012 at 11:00 AM PDT in the Cloud Computing Events section below.


David Ramel (@dramel) asserted “Microsoft released demos of both to get them in front of developers; already there's a wishlist of features from developers. Here's some of my initial thoughts and experiments with them” in a deck for his Trying Out the OData/WCF Data Services Upgrades article of 5/17/2012 for Microsoft Certified Professional Magazine Online’s SQL Advisor column:

Microsoft recently updated the Open Data Protocol (OData) and WCF Data Services framework and just last week provided some demo services so data developers can try out the new features.

The WCF Data Services 5.0 release offers libraries for .NET 4 and Silverlight 4 and a slew of new client and server features, including support for actions, vocabularies, geospatial data, serialization/deserialization of OData payloads, "any" and "all" functions in queries and more (including a new JSON format).

OData, now at version 3, is the Web protocol for querying data using HTTP, Atom Publishing Protocol (AtomPub) and JSON, released by Microsoft under the Open Specification Promise so third-party and open-source projects can build clients and services for it. Documentation for V3 is now available.

The three new V3 demo services include simple read-only and read-write models for Products, Categories and Suppliers, and a read-only service that exposes the trusty Northwind database.

The new support for actions looks promising, providing for example, a Discount action for Products that takes a discountPercentage integer as a parameter and decreases the price of the product by that percentage, as shown on the demo services page.

But I decided to quickly try out something a little simpler just as a proof of concept: the new "any" and "all" operators. They allow tacking onto URLs filters such as this example shown on the demo services page:

http://services.odata.org/V3/OData/OData.svc/Categories?$filter=Products/any(p: p/Rating ge 4)
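
For anyone who wants to generate such URLs rather than type them, here is a minimal Python sketch that composes an "any" or "all" filter and percent-encodes it for the query string. The helper names lambda_filter and filter_url are hypothetical, and the sketch only builds the string; it doesn't call the demo service.

```python
from urllib.parse import quote


def lambda_filter(collection: str, operator: str, variable: str, predicate: str) -> str:
    """Compose a lambda filter such as Products/any(p: p/Rating ge 4)."""
    if operator not in ("any", "all"):
        raise ValueError("operator must be 'any' or 'all'")
    return "{0}/{1}({2}: {3})".format(collection, operator, variable, predicate)


def filter_url(service_root: str, entity_set: str, filter_expr: str) -> str:
    """Append a percent-encoded $filter option to an entity-set URL."""
    # Keep path and lambda punctuation readable; spaces become %20.
    encoded = quote(filter_expr, safe="/():")
    return "{0}/{1}?$filter={2}".format(service_root.rstrip("/"), entity_set, encoded)


if __name__ == "__main__":
    expr = lambda_filter("Products", "any", "p", "p/Rating ge 4")
    print(filter_url("http://services.odata.org/V3/OData/OData.svc", "Categories", expr))
```

Browsers usually encode the spaces for you, but a client that assembles the URL itself generally has to do it explicitly.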

As WCF Data Services supports LINQ, I experimented with the "any" and "all" operators in a LINQ query via a Visual Studio project, using the MSDN Library Quickstart here.

I changed this query:

var ordersQuery = from o in context.Orders.Expand("Order_Details")
where o.Customer.CustomerID == customerId
select o;

to this query (note the use of the "All" operator):

var ordersQuery = context.Orders.Expand("Order_Details")
.Where(c =>
c.Order_Details.All (p =>
p.Quantity > 50));

Sure enough, this query didn't work with the old Northwind service, but it worked after simply inserting "V3" into the service URL so it looks like:

http://services.odata.org/V3/Northwind/Northwind.svc/
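
The difference between the two new operators is easy to see on in-memory data. The Python sketch below uses made-up orders (the Order and OrderDetail classes and their values are hypothetical, not Northwind rows) to show what "all" and "any" select. Note that, in memory, "all" is true for an order with no detail lines at all; it is worth checking how a given service treats that case.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OrderDetail:
    quantity: int


@dataclass
class Order:
    order_id: int
    details: List[OrderDetail] = field(default_factory=list)


def orders_where_all_exceed(orders: List[Order], minimum: int) -> List[int]:
    """IDs of orders in which every detail line has quantity > minimum."""
    return [o.order_id for o in orders if all(d.quantity > minimum for d in o.details)]


def orders_where_any_exceeds(orders: List[Order], minimum: int) -> List[int]:
    """IDs of orders in which at least one detail line has quantity > minimum."""
    return [o.order_id for o in orders if any(d.quantity > minimum for d in o.details)]


if __name__ == "__main__":
    sample = [
        Order(1, [OrderDetail(60), OrderDetail(70)]),
        Order(2, [OrderDetail(60), OrderDetail(10)]),
        Order(3, []),  # no detail lines
    ]
    print(orders_where_all_exceed(sample, 50))
    print(orders_where_any_exceeds(sample, 50))
```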

Fig. 1 shows the result of my efforts, in a WPF application showing customer orders with a quantity of more than 50.

Figure 1. A WPF app successfully pulls Northwind orders via a LINQ query using the new "Any" operator. (Click image to view larger version. [In the original article])

Without the "V3" in the service URL, though, you get an error message (see Fig. 2).

Figure 2. Not using the new V3 OData service results in an error. (Click image to view larger version.)

I recommend the Quickstart as an easy way to experiment with the new OData features, but you have to plug them in yourself because it doesn't use them, though it does require WCF Data Services 5. The completed project files are available if you don't want to go through the whole process of creating each project and just plug new feature functionality into them, as I did.

More improvements may be coming soon, as the WCF Data Services team is now using "semantic versioning" and NuGet, as have other products, such as Entity Framework. One reader asked about support for "Join," while Microsoft's Glenn Gailey has a list of improvements he'd like to see, including support for enums, client JSON support, functions and more (note that this wish list is included in a post of his favorite things that did make it into the new versions).


The Astoria Team announced the availability of NuGet and Bin Deploy on 5/17/2012:

We recently posted about trying to release WCF Data Services more frequently, and some of the changes we’re making. In this post, we’ll take a slightly deeper look at NuGet, bin deploy, and where WCF Data Services is headed.

Managing dependencies with NuGet

If you’re already familiar with NuGet, you probably understand its value proposition and can skip ahead to bin deploying applications. If you haven’t used NuGet before, this section will provide you with a quick introduction. The NuGet site has a great documentation section that provides significantly more detail.

What is NuGet?

Languages such as Ruby and Python have package management systems that make it trivial to take a dependency on a centrally published package. NuGet provides similar functionality for .NET. With a few clicks or a simple command a developer can take a dependency on Microsoft.Data.Services.Client, the assembly that provides WCF Data Services client functionality. Taking a dependency on that package adds three other packages to your project: Microsoft.Data.OData, Microsoft.Data.Edm and System.Spatial.

Getting a specific version of WCF Data Services is also very easy: issuing the command Install-Package Microsoft.Data.Services.Client –Version 5.0.1 will install version 5.0.1 of the client, even if there is a more recent version of the client published. (Issuing the command Install-Package Microsoft.Data.Services.Client or installing the package with the explorer will always get the most recent version of the client.)

Installation experience with Manage NuGet Packages dialog:

Installation experience with Package Manager Console:

Sample installation commands:

  • Install-Package Microsoft.Data.Services.Client: Installs the most recent released version of the WCF Data Services client.
  • Install-Package Microsoft.Data.Services.Client –Pre: Installs the most recent version of the WCF Data Services client (including prereleases).
  • Install-Package Microsoft.Data.Services.Client –Version 5.0.0.50403: Installs an explicit version of WCF Data Services client, ignoring more recent versions.
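
The three commands differ only in which published version they select. The following Python sketch illustrates that selection rule on a made-up feed; it is not NuGet's actual resolution algorithm, the pick_version helper is hypothetical, the 5.1.0-rc1 entry is invented for the example, and prerelease labels are compared as plain text for simplicity.

```python
from typing import List, Optional, Tuple


def parse_version(text: str) -> Tuple[Tuple[int, ...], bool, str]:
    """Split '5.1.0-rc1' into ((5, 1, 0), is_release, prerelease_label)."""
    core, _, label = text.partition("-")
    numbers = tuple(int(part) for part in core.split("."))
    # A version without a label is a release and sorts after its own prereleases.
    return numbers, label == "", label


def pick_version(available: List[str], include_prerelease: bool = False,
                 exact: Optional[str] = None) -> Optional[str]:
    """Choose a version the way the three sample commands are described."""
    if exact is not None:
        # -Version: take exactly what was asked for, ignoring newer versions.
        return exact if exact in available else None
    # Default: released versions only; -Pre: prereleases are candidates too.
    candidates = [v for v in available if include_prerelease or "-" not in v]
    return max(candidates, key=parse_version) if candidates else None


if __name__ == "__main__":
    feed = ["5.0.0.50403", "5.0.1", "5.1.0-rc1"]  # hypothetical feed contents
    print(pick_version(feed))
    print(pick_version(feed, include_prerelease=True))
    print(pick_version(feed, exact="5.0.0.50403"))
```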

Updating Packages with NuGet

In addition to the simple installation process, NuGet makes it easy to update packages.

Update experience with Manage NuGet Packages dialog:

Update experience with Package Manager Console:

Sample update commands:

  • Update-Package: Updates all packages to their most recently released version.
  • Update-Package –Pre: Updates all packages to their most recent versions (including prereleases).
  • Update-Package Microsoft.Data.Services.Client: Updates the Microsoft.Data.Services.Client package to its most recently released version, including any dependency updates required by the Microsoft.Data.Services.Client package.

Simplified Dependency Management

In summary, NuGet greatly simplifies the process of locating, downloading, and adding references to the most recent version of WCF Data Services. Developers no longer need to search the Download Center for a version of WCF Data Services, sort by date, download and install an MSI, manually copy the DLLs and add a reference to them, etc. With NuGet developers simply use the Manage NuGet Packages dialog or the Package Manager Console to easily take a dependency on WCF Data Services.

Bin Deploying Applications

Bin deployment is a term commonly used for a deployment process wherein the contents of the bin folder (potentially with a few other files) are copied to a server using a very simple file copy or similar process. In other words, bin deploy scenarios do not require an MSI to be run on the server that will host the application. This is especially key in scenarios where the server is not controlled by the developer (e.g., Web hosting). However, bin deploy is also beneficial in many other scenarios. Some of the benefits of bin deploy are as follows:

  • Simple file copy deployment using SMB, FTP, or some similar alternative.
  • Assemblies can be configured to run in medium trust (GACed assemblies always run in full trust).
  • No need to have an operations team run an MSI to get the latest version of an assembly.
  • Easier to automate deployments.
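
To make the "simple file copy" point concrete, here is a rough Python sketch of a bin deployment as a plain directory copy. The bin_deploy function and the folder layout are assumptions made for illustration; a real deployment would more likely go over SMB or FTP to a remote share, and would copy the other site files as well.

```python
import shutil
import tempfile
from pathlib import Path


def bin_deploy(project_dir: Path, target_dir: Path) -> int:
    """Copy the project's bin folder to the target; return the file count now in target/bin."""
    source = project_dir / "bin"
    if not source.is_dir():
        raise FileNotFoundError("no bin folder under {0}".format(project_dir))
    destination = target_dir / "bin"
    # dirs_exist_ok (Python 3.8+) lets a redeploy overwrite an earlier copy in place.
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return sum(1 for path in destination.rglob("*") if path.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "site"
        (project / "bin").mkdir(parents=True)
        # Stub file standing in for a real assembly.
        (project / "bin" / "Microsoft.Data.Services.Client.dll").write_bytes(b"stub")
        server = Path(tmp) / "server"
        server.mkdir()
        print(bin_deploy(project, server), "file(s) deployed")
```

Because nothing is installed on the target, rerunning the same copy is the whole upgrade story, which is what makes this style of deployment easy to automate.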

Bin deployment is also a natural companion for NuGet, semantic versioning, and our amended EULA for the imminent 5.0.1 version of WCF Data Services, which will include a redistribution clause.

A Glimpse Into the Future

Down the road, we will be taking a much deeper dependency on NuGet – we are currently revamping our tooling to install NuGet packages when you add a WCF Data Service or a reference to an OData service to your application.


Windows Azure Service Bus, Access Control, Identity and Workflow

No significant articles today.


Windows Azure VM Role, Virtual Network, Connect, RDP and CDN

No significant articles today.


Live Windows Azure Apps, APIs, Tools and Test Harnesses

Richard Conway (@azurecoder) posted Some updates to [Windows Azure] fluent management on 5/19/2012:

It’s been a while since I’ve done any posts on fluent management. This is taking place in the background and I’m using one of our projects to drive the development of this. I’ve baked in a lot of changes and I realise now that if I’m going to get people to use this then I need to set up a wiki!

Recently I had a pingback from Michael Collier from Neudesic who looked at the library and saw that there was nothing in it to pick up state changes in roles. This is the kind of feedback I’d like. I have baked in a notification interface now and have a wrapper called ServiceSystemWatcher which will ping on the state change. As the lib gets more complicated now I’m forced to use config and defaults for many of the properties.

There have been a whole heap of changes which allow config to be injected in prior to deployment and that wrap up .cscfg files so we have a nice way to add and remove settings. This proved mandatory for us on the current project we’re undertaking which needs to use config-driven settings for plugins we’re writing at runtime.

One of the key aspects which will be added in the next release is the idea of workflow. I want to be able to add a storage account, do something with it, add a database, add a hosted service, create a service bus namespace and queue etc. in a single transaction and rollback if this fails at any point. I’ve added some context interfaces to the lib which should enable this fairly easily now.

Someone recently asked whether the lib was published under an open source license. Yes, it is under a GNU lesser license. Currently it’s in beta so we won’t take responsibility for it if you use it in production and it fails. We’re going to offer a mandatory support contract going forward if you turn over more than $5m/year. This is mainly because a lot of consultancies that we’ve worked with don’t put anything back in the ecosystem and we don’t want them to profit from our labour and maintenance if they’re not helping the community. This has been our general experience with the user group – not a lot of support from the “partner” “community” – two terms I use very loosely.

For background about the Windows Azure Fluent Management library, see Release of Azure Fluent Management v0.1 library of 3/26/2012 and Richard’s later posts.


Wely Lau (@wely_live) answered Managing session state in Windows Azure: What are the options? on 5/18/2012:

One of the most common questions in developing ASP.NET applications on Windows Azure is how to manage session state. The intention of this article is to discuss several options to manage session state for ASP.NET applications in Windows Azure.

What is session state?

Session state is usually used to store and retrieve values for a user across ASP.NET pages in a web application. There are four available modes to store session values in ASP.NET:

  1. In-Proc, which stores session state in the individual web server’s memory. This is the default option if a particular mode is not explicitly specified.
  2. State Server, which stores session state in another process, called ASP.NET state service.
  3. SQL Server, which stores session state in a SQL Server database
  4. Custom, which lets you choose a custom storage provider.

You can get more information about ASP.NET session state here.

In-Proc session mode does not work in Windows Azure

The In-Proc option, which uses an individual web server’s memory, does not work well in Windows Azure. This applies to those of you who host your application in a multi-instance web-farm environment, because the Windows Azure load balancer uses round-robin allocation across instances.

For example: you have three instances (A, B, and C) of a Web Role. The first time a page is requested, the load balancer will allocate instance A to handle your request. However, there’s no guarantee that instance A will always handle subsequent requests. Similarly, the value that you set in instance A’s memory can’t be accessed by other instances.

The following picture illustrates how session state works in multi-instances behind the load balancer.

Figure 1 – WAPTK BuildingASP.NETApps.pptx Slide 10
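
The three-instance example can be simulated in a few lines. This Python sketch is only a toy model, with hypothetical class names and a strict round-robin rotation standing in for the load balancer; it shows a value written on instance A being invisible to instance B on the next request.

```python
from itertools import cycle
from typing import Dict, List, Optional


class WebRoleInstance:
    """One web server that keeps session values in its own memory (In-Proc mode)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: Dict[str, Dict[str, str]] = {}

    def set_value(self, session_id: str, key: str, value: str) -> None:
        self.sessions.setdefault(session_id, {})[key] = value

    def get_value(self, session_id: str, key: str) -> Optional[str]:
        return self.sessions.get(session_id, {}).get(key)


class RoundRobinBalancer:
    """Hands each incoming request to the next instance in turn."""

    def __init__(self, instances: List[WebRoleInstance]) -> None:
        self._rotation = cycle(instances)

    def route(self) -> WebRoleInstance:
        return next(self._rotation)


if __name__ == "__main__":
    balancer = RoundRobinBalancer([WebRoleInstance(n) for n in ("A", "B", "C")])
    first = balancer.route()
    first.set_value("session-1", "cart", "3 items")
    second = balancer.route()
    print(first.name, first.get_value("session-1", "cart"))
    print(second.name, second.get_value("session-1", "cart"))
```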

The other options
1. Table Storage

Table Storage Provider is a subset of the Windows Azure ASP.NET Providers written by the Windows Azure team. The Table Storage Session Provider is, in fact, a custom provider that is compiled into a class library (.dll file), enabling developers to store session state inside Windows Azure Table Storage.

The way it actually works is to store each session as a record in Table Storage. Each record has an expiration column that describes when the session expires if there’s no interaction from the user.
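
The record-per-session idea with an expiry timestamp can be sketched as follows. This is an in-memory Python stand-in, not the provider's actual implementation: the SessionTable class, the sliding renewal on each read, and the 20-minute timeout are all assumptions chosen for the illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional


@dataclass
class SessionRecord:
    session_id: str
    data: str
    expires: datetime


class SessionTable:
    """In-memory stand-in for a table that holds one record per session."""

    def __init__(self, timeout: timedelta) -> None:
        self._timeout = timeout
        self._rows: Dict[str, SessionRecord] = {}

    def save(self, session_id: str, data: str, now: datetime) -> None:
        self._rows[session_id] = SessionRecord(session_id, data, now + self._timeout)

    def load(self, session_id: str, now: datetime) -> Optional[str]:
        row = self._rows.get(session_id)
        if row is None or row.expires <= now:
            return None  # unknown session, or expired through inactivity
        # Each interaction pushes the expiry forward.
        row.expires = now + self._timeout
        return row.data


if __name__ == "__main__":
    start = datetime(2012, 5, 18, 12, 0)
    table = SessionTable(timeout=timedelta(minutes=20))
    table.save("abc", "cart=3", start)
    print(table.load("abc", start + timedelta(minutes=10)))
    print(table.load("abc", start + timedelta(minutes=60)))
```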

The advantage of Table Storage Session Provider is its relatively low cost: $0.14 per GB per month for storage capacity and $0.01 per 10,000 storage transactions. Nonetheless, according to my own experience, one of the notable disadvantages of Table Storage Session Provider is that it may not perform as fast as the other options discussed below.
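
As a back-of-the-envelope check on those prices, the Python sketch below estimates a monthly bill from the two rates quoted above. The workload figures (1 GB stored, 5 million transactions) are invented for the example, and the function assumes transactions are billed proportionally, which may not match how the service actually rounds.

```python
STORAGE_PRICE_PER_GB_MONTH = 0.14   # USD, rate quoted in the article
TRANSACTION_PRICE_PER_10K = 0.01    # USD, rate quoted in the article


def monthly_table_storage_cost(stored_gb: float, transactions: int) -> float:
    """Estimate one month's cost in USD for session data kept in Table Storage."""
    storage = stored_gb * STORAGE_PRICE_PER_GB_MONTH
    traffic = (transactions / 10_000) * TRANSACTION_PRICE_PER_10K
    return round(storage + traffic, 2)


if __name__ == "__main__":
    # Hypothetical workload: 1 GB of session data, 5 million transactions a month.
    print(monthly_table_storage_cost(1, 5_000_000))
```

With numbers like these, the transaction charge rather than the storage charge dominates the bill, since every session read and write counts as a transaction.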

The following code snippet should be applied in web.config when using Table Storage Session Provider.

<sessionState mode="Custom" customProvider="TableStorageSessionStateProvider">
  <providers>
    <clear/>
    <add name="TableStorageSessionStateProvider"
         type="Microsoft.Samples.ServiceHosting.AspProviders.TableStorageSessionStateProvider" />
  </providers>
</sessionState>

You can get more detail on using Table Storage Session Provider step-by-step here.

2. SQL Azure

As SQL Azure is essentially a subset of SQL Server, SQL Azure can also be used as storage for session state. With just a few modifications, SQL Azure Session Provider can be derived from SQL Server Session Provider.

You will need to apply the following code snippet in web.config when using SQL Azure Session Provider:

<sessionState mode="SQLServer"
sqlConnectionString="Server=tcp:[serverName].database.windows.net;Database=myDataBase;User ID=[LoginForDb]@[serverName];Password=[password];Trusted_Connection=False;Encrypt=True;"
cookieless="false" timeout="20" allowCustomSqlDatabase="true"
/>

For the detail on how to use SQL Azure Session Provider, you can either:

The advantage of using SQL Azure as session provider is that it’s cost effective, especially when you have an existing SQL Azure database. Although it performs better than Table Storage Session Provider in most cases, it requires you to clean the expired session manually by calling the DeleteExpiredSessions stored procedure. Another drawback of using SQL Azure as session provider is that Microsoft does not provide any official support for this.

3. Windows Azure Caching

Windows Azure Caching is probably the most preferable option available today. It provides a high-performance, in-memory, distributed caching service. The Windows Azure session state provider is an out-of-process storage mechanism for ASP.NET applications. As we all know, accessing RAM is very much faster than accessing disk, so Windows Azure Caching obviously provides the highest performance access of all the available options.

Windows Azure Caching also comes with a .NET API that enables developers to easily interact with the Caching Service. You should apply the following code snippet in web.config when using Cache Session Provider:

<sessionState mode="Custom" customProvider="AzureCacheSessionStoreProvider">
  <providers>
    <add name="AzureCacheSessionStoreProvider"
         type="Microsoft.Web.DistributedCache.DistributedCacheSessionStateStoreProvider, Microsoft.Web.DistributedCache"
         cacheName="default" useBlobMode="true" dataCacheClientName="default" />
  </providers>
</sessionState>

A step-by-step tutorial for using Caching Service as session provider can be found here.

Besides providing high-performance access, another advantage of Windows Azure Caching is that it’s officially supported by Microsoft. Despite these advantages, the cost of Windows Azure Caching is relatively high, starting at $45 per month for 128 MB and going all the way up to $325 per month for 4 GB.
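To put the two quoted price points side by side, here is a small Python sketch of the per-megabyte arithmetic. It uses only the two tiers mentioned above and assumes 4 GB means 4096 MB; intermediate tiers are not listed in this post, so none are invented here.

```python
# Price points quoted in the post (cache size in MB -> USD per month).
# Assumption: 4 GB is treated as 4096 MB.
TIERS = {128: 45.0, 4096: 325.0}

def cost_per_mb(size_mb, monthly_price):
    """Return the monthly price per megabyte of cache."""
    return monthly_price / size_mb

for size_mb, price in TIERS.items():
    print(f"{size_mb} MB: ${cost_per_mb(size_mb, price):.3f} per MB per month")
```

The smallest tier works out to roughly 35 cents per MB per month and the largest to roughly 8 cents, so the per-megabyte price falls noticeably as the cache grows.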

Conclusion

I haven’t discussed all the available options for managing session state in Windows Azure, but the three I have discussed are the most popular options out there, and the ones that most people are considering using.

Windows Azure Caching remains the recommended option despite its cons, but developers and architects shouldn’t be afraid to choose a different option if it’s more suitable for them in a given scenario.

This post was also published at A Cloud[y] Place blog.

Full disclosure: I’m a paid contributor to the ACloudyPlace blog.



Visual Studio LightSwitch and Entity Framework 4.1+

Jay Schmelzer posted Visual Studio 11 Product Line-up Announced on 5/18/2012:

Today the Visual Studio 11 product line-up was announced on the Visual Studio Team blog. Part of this announcement was information on what editions will support LightSwitch development.

Launched last year as an out-of-band release, I’m excited to announce that LightSwitch is now a core part of the Visual Studio product line! LightSwitch will be available through Visual Studio 11 Professional, Premium and Ultimate. With this integration, Visual Studio now provides a comprehensive solution for developers of all skill levels to build line-of-business applications and data services quickly and easily for the desktop and cloud.

I am particularly excited about the additional tools for data application development that will be available to you. In addition, with the new data services (OData) support in LightSwitch, you will be able to build additional clients using the broad set of project templates now included in these editions including Windows 8 Metro style apps.


LightSwitch will be retired from sale as a standalone product with the release of Visual Studio 11. If you acquire Visual Studio Professional, Premium, or Ultimate you will also get the LightSwitch development experience included in the box. We previously announced a price reduction for Visual Studio 2010 Professional to align it with the planned pricing for Visual Studio 11 making this an even more exciting offer. For more information on Visual Studio pricing please see: http://www.microsoft.com/visualstudio/11/en-us/products/pricing. [Emphasis added.]

Bringing LightSwitch and Visual Studio closer together was a natural choice. LightSwitch is a valuable tool offering a wide variety of developers the ability to quickly and easily build line of business applications. By including LightSwitch into the core Visual Studio 11 product line we are able to more fully integrate the products, making both stronger and offering additional value to developers of all skill levels.

Jay is Principal Director Program Manager of the LightSwitch Team.


Keith Craigo described Visual Studio LightSwitch Deployment GOTCHA's in a 5/18/2012 post:

Over the past several days I've been deploying two Visual Studio LightSwitch 2011 applications to my company's intranet server. I may revise this for Visual Studio 11 at a later time.

The 1st application was a packaged application, meaning I developed it on my personal computer, zipped it, and then gave it to my employer's admin to install (namely me; I'm the admin).

The 2nd application was deployed directly to the server from my work laptop. Needless to say, neither went as smoothly as I had hoped, but I did learn some valuable lessons along the way.

NOTE: In this article I'm going to talk about what I did to deploy the 2nd application, a "WEB application", directly to the server, and I'll point out some issues I came across. The environment restricts access to the application using Windows Authentication on my company's intranet server. I'll talk about deploying a package at a later time.

Disclaimer: Not every scenario will be discussed here; I can only comment on what worked best for me. I'm not a server administrator, so please keep reading with that in mind.

I hope you find what I've learned useful.

Guides

First and foremost keep these two references handy:

  1. Beth Massi's Deployment Guide: How to Configure a Web Server to Host LightSwitch Applications. She goes over in detail how to set up your server.
  2. Eric Erhardt's Diagnosing Problems in a Deployed 3-Tier LightSwitch Application

I also highly recommend installing Fiddler2 to debug your browser's traffic. It has proved to be a valuable resource for me.

If using DevExpress XtraReports

The DevExpress XtraReports Suite is fantastic, but I've come across some GOTCHAS when it comes to deploying.

First, you need to ensure that, under the References folders of the Server and Server Generated projects, the Copy Local property of every DevExpress library is set to true. Thank you Supreet Tare for this tip.

One thing that kept happening to me was that, for some reason, a Release build changed some of the libraries back to false (I will troubleshoot that later).

If you receive the error message "Load operation failed for query 'GetAuthenticationInfo'" on your deployed application, be aware that this message is often misleading, so check that Copy Local has not been reset to false on any of the libraries.

GOTCHA Update 5/18/12 - XtraReports running under Authentication mode ASP.NET Impersonation is not supported at this time; this prevents XtraReports from accessing the database.

DevExpress has provided a workaround on their Support Forum (90% down the page) http://www.devexpress.com/Support/Center/Question/Details/Q385592

Deployment Checklist

ON THE APPLICATION HOST

I followed Beth Massi's Deployment Guide: How to Configure a Web Server to Host LightSwitch Applications to install all the LightSwitch prerequisites on the server.

Ask your server admin to assist with these if you don't have access:

  1. Install the Web Platform Installer on the server as well as on your development machine. It makes life a lot easier, trust me.
  2. Install the Web Deployment Tool on both the server and the development machine. I used version 1.1.
  3. Get the credentials of an account that can connect to the server and has appropriate permissions to install your app, i.e. a server admin.
  4. Get the credentials of the SQL Server user that will perform read/write functions against the database on behalf of your users. In the Publishing Wizard, this goes under Database Connection, in "Specify the user connection:". Or you can click the Create Database Login button and LightSwitch will take care of this.
  5. Get the credentials of the SQL Server Application Administrator. This is the user that will be allowed to create roles, assign privileges to roles, create user accounts and assign users to roles for your application. Or you can select the "Yes, create the Application Administrator at this time" radio button, fill in the details, and LightSwitch will take care of this.
  6. I created ahead of time an APPPOOL with .NET 4.0 and the Integrated pipeline mode. I'm going to assign my application to this APPPOOL.
  7. If supporting multiple users, a maintenance time saver is to create a host server group and add all your users to this group. Once the application database is created, add the group to the database server logins and assign the db_datareader and db_datawriter roles for your application's database to this group. Now everyone is updated in one place. Woo!Hoo!
    NOTE: I will update this once I get into controlling access through Active Directory groups.

ON THE DEVELOPMENT MACHINE

Manually copy your database files to the server.

Open SQL Server Express Management Studio, right-click your database, and select Tasks-Generate Scripts.

Then make sure that the radio button for "Script entire database and all database objects" is selected. PLEASE NOTE: This tool does not script any Triggers - this was a GOTCHA. But we can remedy this by copying and pasting the trigger into the appropriate location on the production server.

Save the output to a convenient location then copy the zip file over to the production server.

Switch to the Production Server

Open the zip file you just copied in the SQL Server Management Studio and execute the scripts to create the production database, then create the appropriate triggers on your tables.

Once the database is set up, go to the SQL Server Security folder, right-click the user you created in step 4 above, and select Properties.

Select User Mappings and then the database you just created.

Assign aspnet_Membership_BasicAccess, aspnet_Membership_FullAccess, aspnet_Membership_ReportingAccess, aspnet_Roles_BasicAccess, aspnet_Roles_FullAccess, aspnet_Roles_ReportingAccess, db_datareader, and db_datawriter to this account. Public is selected by default. Click OK.

Open the database Security-Users folder and you will see that this user has been added.

Switch back to your development machine

Publishing: In the Publishing Wizard dialog box, under Database Connections and Other Connections, change the Data Source to the server that will be hosting the application and the user credentials to the appropriate users you specified above.
In Other Connections, for DevExpress.XtraReports and my WCF RIA Services I put bin for the location.

Switch to the Production Server

Open IIS Manager. If the icon is not available, click Start and, in the Search programs and files box, type inetmgr.

Select your new virtual directory, then in the Actions panel select Basic Settings...
In the dialog box, click the Select button and then choose the APPPOOL you created in step 6 above. Click OK and OK again.

Make sure you’re on the /project Home page.

NOTE: If you need to trace the application, please refer to Eric Erhardt's Diagnosing Problems in a Deployed 3-Tier LightSwitch Application.

Under IIS select Authentication, again you can just double click this icon or in the Actions panel select Open Feature.

I enabled Windows Authentication and ASP.NET Impersonation, all other settings are disabled.

Ok this is one other GOTCHA I ran across.

Revision 5/17/12 - You can set these settings before deploying, in Visual Studio - Select your projects top node and click File View then select All Files, navigate to the ServerGenerated folder and open the Web.config file and make your changes.

With the project selected, in the Actions panel select Explore, then open your web.config file by right-clicking it and choosing Edit (you may have to give yourself read/write privileges first).

Scroll down to your connection strings and make sure that all of the data sources are set to the production server and not the development machine. I had an issue where one of my WCF RIA Services still pointed to my development machine.
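This check can also be scripted. The following is a rough Python sketch, not something from the original post: the function name, the sample server names and the sample web.config fragment are all hypothetical, and the test is a simple case-insensitive substring match on the whole connection string, so treat a clean result as a hint rather than proof.

```python
import xml.etree.ElementTree as ET

def stale_connection_strings(web_config_xml, production_server):
    """Return names of connection strings that do not mention the production server."""
    root = ET.fromstring(web_config_xml)
    stale = []
    for section in root.iter("connectionStrings"):
        for entry in section.findall("add"):
            conn = entry.get("connectionString", "")
            # Crude check: the server name must appear somewhere in the string.
            if production_server.lower() not in conn.lower():
                stale.append(entry.get("name"))
    return stale

# Hypothetical sample: one entry still points at a development machine.
SAMPLE = """<configuration>
  <connectionStrings>
    <add name="AppData" connectionString="Data Source=PRODSQL01;Initial Catalog=App;Integrated Security=True" />
    <add name="ReportService" connectionString="Data Source=DEVLAPTOP\\SQLEXPRESS;Initial Catalog=App;Integrated Security=True" />
  </connectionStrings>
</configuration>"""

print(stale_connection_strings(SAMPLE, "PRODSQL01"))  # ['ReportService']
```

Running something like this against the deployed web.config before browsing the site would have flagged the WCF RIA Services entry mentioned above.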

Save your changes and close this file. You may even want to revoke the write privileges for your account.

Now if you have Fiddler installed you may want to run this first, but it's not required.

If you enabled Tracing, just navigate to http://{your server}/{your project}/trace.axd

Back on the /project Home page, in the Actions pane under Manage Application, click Browse *:80 (http), or Browse *:443 (https) if you're running under SSL.

I chose the Browse *:80 (http) link. Run your application and step through your screens; give it a thorough thrashing to make sure everything is working as it should from the server.

Now all I have to do is troubleshoot an issue with one of my WCF RIA Services; hopefully by the end of today I'll have that resolved.

Once that is resolved, I can re-publish to the server and invite my co-workers to test my application. Once everything has been tested, debugged and, if necessary, fixed and tested again, I'll be ready to release it to my users.



Windows Azure Infrastructure and DevOps

John Treadway (@CloudBzz) recommended RACI and PaaS – A Change in Operations in a 5/18/2012 post:

I have been having a great debate with one of my colleagues about the changing role of the IT operations (aka “I&O”) function in the context of PaaS. Nobody debates that I&O is responsible and accountable for infrastructure operations.

Application developers (with or without the blessing of Enterprise Architecture) select platform components such as application servers, middleware etc. I&O keeps the servers running – probably up to the operating system. The app owners then manage their apps and the platform components. I&O has no SLAs on the platform, etc.

In the PaaS era, I think this needs to change. IT Operations (I&O) needs to have full accountability and responsibility for the OPERATION of the PaaS layer. PaaS is no longer a part of the application, but is now really part of the core platform operated by IT. It’s about 24×7 monitoring, support, etc. and generally this is a task that I&O is ultimately best able to handle.

Both teams need to be accountable and responsible for the definition of the PaaS layer to ensure it meets the right business and operational needs. But when it comes to operations, I&O now takes charge.

The implication of this will be a need for PaaS operations and administration skills in the I&O business. It also means that the developers and application ownership teams need only worry about the application itself – and not the standard plumbing that supports it.

Result? Better reliability of the application AND better agility and productivity in development. That’s a win, right?


Michael D. Dunn described Running Mission Critical Solutions on Windows Azure in a 5/17/2012 post to his MSDN blog:

I wanted to follow up on my previous post for Common Tip and post something that covers how Upgrades work and how to achieve maximum availability and scale, along with deployment and monitoring recommendations.

With Windows Azure there are a lot of options on how to manage and implement deployments, upgrades, scale and availability. Mission Critical applications require more effort and planning on which Azure features to leverage based on their availability requirements.

Understanding Updates & Upgrades

There are three types of Update events that can occur on the Windows Azure Platform.

  • Application Upgrade
  • Guest OS Update
  • Host OS Update

An Application Upgrade occurs when you do an in-place upgrade on your application; the options for this are covered in the Deployment section.

The Guest OS version is controlled through the Azure Service Configuration by setting the osFamily and osVersion attributes. The osFamily attribute can currently be 1 (Windows Server 2008) or 2 (Windows Server 2008 R2). The osVersion attribute selects a group of OS patches and updates released at a given time; setting osVersion to “*” instead of an explicit version installs OS patches automatically. For Mission Critical applications, you will have to determine whether having the latest patches, including fixes for security vulnerabilities, installed automatically is more of a risk to your solution than performing the upgrade manually. There are two ways to perform this update manually: through the Service Configuration or through “Configure OS” within the Management Portal.
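As an illustration only, a deployment script could read these two attributes and warn when automatic patching is in effect. The Python sketch below assumes the attributes sit on the root ServiceConfiguration element of the .cscfg file; the service name, role name and function name are hypothetical, and the sample is not taken from the original post.

```python
import xml.etree.ElementTree as ET

# Hypothetical service configuration used only for this example.
SAMPLE_CSCFG = """<ServiceConfiguration serviceName="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"
    osFamily="2" osVersion="*">
  <Role name="WebRole1">
    <Instances count="2" />
  </Role>
</ServiceConfiguration>"""

def guest_os_policy(cscfg_xml):
    """Return (osFamily, osVersion, auto_patched) read from the root element."""
    root = ET.fromstring(cscfg_xml)
    # Unprefixed attributes are not namespaced, so plain names work here.
    os_family = root.get("osFamily")
    os_version = root.get("osVersion")
    return os_family, os_version, os_version == "*"

print(guest_os_policy(SAMPLE_CSCFG))  # ('2', '*', True)
```

A release checklist for a Mission Critical service could fail the build when the third value is True and the team has decided to pin the Guest OS version.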

The Host OS Update occurs on the host system. Windows Azure tenants have no control over when this update happens. To uphold the 99.95% SLA, Windows Azure does not update all hosts at once but performs host updates on hosts in different fault domains, which is why you must have at least 2 instances of each role for the guaranteed 99.95% SLA. The Scalability & Elasticity section goes into more detail on how you can ensure capacity during these updates.
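For a sense of scale, the 99.95% figure can be converted into a downtime budget. This Python sketch is simple arithmetic under a stated assumption (a 30-day month); it does not reproduce how the SLA itself measures availability.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime permitted in a period at the given availability.

    Assumption: the period is `days` full days (30 by default).
    """
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)

print(round(allowed_downtime_minutes(99.95), 1))  # 21.6
```

In other words, a role that meets the SLA exactly may still be unreachable for a little over 20 minutes in a 30-day month.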

Increasing Scalability & Elasticity

It is often tempting to choose the largest instance possible; surely adding more memory or CPU will increase the performance of your application? The problem is that most applications are not written specifically to leverage multiple CPU cores, and I have yet to see an application that actually needed the 14 GB of memory that the Extra Large VM provides.

Which is better: 4 Mediums or 2 large instances?

I would argue that 4 Mediums would be “better”, as they would give you the elasticity to increase OR decrease the number of instances based on your load at any given time, while reducing cost by not over-provisioning resources. For example, if you have 2 Large instances, you couldn’t scale down without sacrificing the 99.95% SLA provided for roles that have at least 2 instances, and you couldn’t scale up without paying for an entire new Extra Large VM.

Also, choosing the Medium instances over the Large instances would allow you to increase the number of update domains from 2 to 4, which gives you higher available capacity during Host OS updates, Guest OS updates and Application updates. For example, if a Host OS update occurs (and you cannot control when it happens) while you are using 2 Large instances, your solution could only handle 50% of your maximum capacity during that time frame. On the other hand, if you had used 4 Medium instances, you could choose to create 4 update domains, one for each instance, which means that during any update your solution would be able to handle 75% of your maximum capacity.
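The 50% and 75% figures follow from a simple calculation, sketched here in Python. The function is illustrative only and assumes instances are spread as evenly as possible across update domains, with the fullest domain being the one taken offline.

```python
def capacity_during_update(instances, update_domains):
    """Fraction of instances still serving while one update domain is offline.

    Assumes an even spread of instances and that the largest domain is the
    one being updated (the worst case).
    """
    if update_domains < 1 or instances < update_domains:
        raise ValueError("need at least one instance per update domain")
    largest_domain = -(-instances // update_domains)  # ceiling division
    return (instances - largest_domain) / instances

print(capacity_during_update(2, 2))  # 0.5
print(capacity_during_update(4, 4))  # 0.75
```

The fraction is relative to the deployment's own maximum, so it applies equally whether the instances are Medium or Large.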

What is your solution’s tolerance for reduced capacity? Maybe 25%-50% reduced capacity for a short period is acceptable; if not, what are your options?

The simplest option is to create an additional update domain that runs the same number of instances as each of your other domains. In the example above of using 4 Medium instances, this would mean running a 5th update domain with 1 additional Medium instance.
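The spare-domain sizing can be written down as a short calculation. This Python sketch uses a hypothetical function name and assumes the existing domains are evenly filled and the extra domain is the same size as each of them.

```python
def instances_with_spare_domain(required_instances, update_domains):
    """Total instances to run so that losing one update domain still leaves
    at least `required_instances` serving.

    Assumes one extra domain is added, sized like the existing, evenly
    filled domains.
    """
    per_domain = -(-required_instances // update_domains)  # ceiling division
    return per_domain * (update_domains + 1)

print(instances_with_spare_domain(4, 4))  # 5
```

With 5 instances across 5 domains, taking any one domain offline still leaves the 4 instances the workload needs.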

Another option would be to consider auto scaling or reducing the availability of features that are CPU and/or memory intensive. The Enterprise Library Integration Pack for Windows Azure includes WASABi, the Windows Azure Autoscaling Application Block. This application block allows you to create rules to scale up/down and to reduce resource-intensive features on the fly. Developers can download this application block from: http://www.microsoft.com/en-us/download/details.aspx?id=28189
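To make the idea of a scaling rule concrete, here is a generic threshold rule sketched in Python. It is not WASABi's rule format or API; the function name, thresholds and limits are assumptions chosen for illustration, with the minimum of 2 instances reflecting the SLA requirement discussed earlier.

```python
def target_instance_count(current, avg_cpu_percent, minimum=2, maximum=8,
                          scale_up_at=75.0, scale_down_at=30.0):
    """Return the desired instance count for one evaluation of the rule.

    Adds one instance above `scale_up_at`, removes one below `scale_down_at`,
    and always stays within [minimum, maximum].
    """
    if avg_cpu_percent > scale_up_at:
        desired = current + 1
    elif avg_cpu_percent < scale_down_at:
        desired = current - 1
    else:
        desired = current
    return max(minimum, min(maximum, desired))

print(target_instance_count(4, 82.0))  # 5
print(target_instance_count(2, 10.0))  # 2
```

The second call shows the floor at work: the rule never drops below 2 instances even when load is low.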

Increasing Availability with Geo-Redundancy

With two fault domains in a single datacenter and multiple upgrade domains, you are set up nicely in the event of a failure inside the datacenter. What about a complete datacenter failure? As with a failure inside the datacenter, you have to decide what your tolerance is for a complete datacenter failure. If your tolerance is low, then you should consider deploying your application into two datacenters. If you are not using auto scaling type features, this means running a complete duplicate of all of your instances in a secondary datacenter so that it can run at 100% of the capacity you had prior to the failure. Once you have your solution deployed into two datacenters, you can leverage the Traffic Manager feature to create a new ‘Failover’ policy for your secondary deployment.
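Conceptually, a failover policy sends traffic to the first deployment in a priority list that is still healthy. The Python sketch below illustrates that selection logic only; it is not the Traffic Manager API, and the host names and health check are hypothetical.

```python
def pick_endpoint(endpoints, is_healthy):
    """Return the first healthy endpoint in priority order, or None."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None

# Hypothetical deployment names, listed primary first.
DEPLOYMENTS = ["myapp-primary.cloudapp.net", "myapp-secondary.cloudapp.net"]

# Simulate a failure of the primary datacenter.
down = {"myapp-primary.cloudapp.net"}
print(pick_endpoint(DEPLOYMENTS, lambda host: host not in down))
```

With the primary marked as down, the secondary deployment is selected; once the primary recovers, it is chosen again because it comes first in the list.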

It sounds simple right? Well there are other considerations when planning to have a complete failover, such as are you leveraging any other Windows Azure features, such as storage, Service Bus, Cache or Access Control? Since these services are datacenter dependent in the event of a failure, these services may not be available.

This is not a simple task and will take additional development and planning. While I’ve geared this section towards a complete datacenter failure, the same approach could be leveraged for a specific service failure. For example, if your solution leverages Table Storage and the storage service in your application’s datacenter has a failure, your Windows Azure Compute instances will still be running, but any features leveraging storage will not be available; depending on the feature, this could be a critical part of the application.

Deployment Process Recommendations

With Windows Azure there are multiple options for deploying applications. You can use Visual Studio, the management portal, or something custom using the Service Management API. There isn’t one way that is better than the others, but you should deploy your application with a consistent method across all environments: Development, Testing, Staging and Production.

If it is possible within your team or organization, you should consider automated deployments. This option takes additional time, but allows you to deploy consistently. Automated deployments can be achieved in various ways, but currently require a custom development effort. The largest benefit of automated deployments is that operations teams can rapidly re-deploy services in other data centers in the event of a catastrophic data center failure. For thoughts around how an automated deployment could be implemented, read “Automating Deployment and using Windows Azure Storage” in Moving Applications to the Cloud, written by Microsoft’s Patterns and Practices team: http://msdn.microsoft.com/en-us/library/ff803365

If automated deployment isn’t feasible, you should consider leveraging the Windows Azure PowerShell Cmdlets, available for download here: http://wappowershell.codeplex.com/

These PowerShell Cmdlets provide a consistent management interface that conforms to other Microsoft products that use PowerShell for management.

Another common question is “Should I use a VIP Swap or leverage the In-Place Upgrade?” With the 1.5 Windows Azure SDK, In-Place Upgrade features are on par with a VIP Swap. So what is the advantage of using a VIP Swap? The advantage is the ability to follow a process and smoke test your application prior to it being published “live”. If you were to use an in-place upgrade, the application is “live” as soon as the upgrade completes. Mistakes happen in IT; what if someone published the wrong version? If you used a VIP swap, you could smoke test the application first, and even after you perform the VIP swap, you could revert to the previous version instantaneously!
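A smoke test against the staging deployment can be as small as requesting a handful of pages and checking the status codes. The Python sketch below is an illustration, not a prescribed process: the function name and paths are hypothetical, and the HTTP call is passed in as `fetch_status` so the check does not depend on any particular client library.

```python
def smoke_test(base_url, paths, fetch_status):
    """Return the list of paths that did not answer with HTTP 200.

    `fetch_status(url)` is supplied by the caller and returns a status code,
    so the same check can be pointed at the staging URL before a VIP swap.
    """
    failures = []
    for path in paths:
        url = base_url.rstrip("/") + "/" + path.lstrip("/")
        if fetch_status(url) != 200:
            failures.append(path)
    return failures
```

An empty result would be the signal to go ahead with the swap; any failing path is a reason to keep the current production deployment in place.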

Solution Monitoring

Even if you are using an auto scaling framework such as WASABi – Monitoring is an important aspect. Monitoring will assist in diagnosing issues within your application and knowing important key metrics about your application performance, errors and load. While Windows Azure can log Windows Events, Performance Counters and Logs to your storage accounts there is no “Out of the Box” Solution to view this data in a user friendly, graphical format.

System Center Operations Manager offers a Windows Azure management pack, which allows you to view alerts for Events and Performance metrics of your applications in a friendly manner.

The Windows Azure Management pack is available for download: http://www.microsoft.com/en-us/download/details.aspx?id=11324

Disaster Recovery

While it is true that Windows Azure does back up your Storage and SQL Azure data in triplicate, this isn’t something you can leverage to restore your data on demand. These backups are used by the Windows Azure teams in case of a catastrophic failure so they can restore an entire datacenter, not necessarily to recover data for individual Windows Azure tenants. While this provides some relief, it is best to have backups of your application, configurations and data so that you can restore onto another Azure datacenter manually. While creating backups of Storage data would be a custom development effort, relational data stored on SQL Azure can be backed up to a file, and restored from it, using the Import/Export feature in Windows Azure.


Mike Benkovich (@mbenko) continued his series with CloudTip #13-What do you need to know to get started? on 5/17/2012:

There are many ways to learn a new technology. Some of us prefer to read books, others like videos or screencasts, still others will choose to go to a training style event. In any case you need to have a reason to want to learn, whether it's a new project, something to put on the resume or just the challenge because it sounds cool. For me I learn best when I've got a real project that will stretch my knowledge to apply it in a new way. It also helps to have a deadline.

I've been working for a while now for Microsoft in a role that allows me to help people explore what's new and possible with the new releases of technology coming out at a rapid pace, from client and web technologies like ASP.NET and Phone, to user interface techniques like Silverlight and Ajax, to server and cloud platforms like SQL Server and Azure. The job has forced me to stay abreast of how the technologies work and what you can do with them, and to understand how to explain why and how they might fit into a project.


Try Azure for 90 Days Free!



In this post I'd like to provide a quick tour of where you can find content and events on Cloud Computing that should help you get started and find answers along the way.

Part 1 - Get Started with Cloud Computing and Windows Azure.
You've heard the buzz, your boss might even have talked about it. In this first webcast of the Soup to Nuts series we'll get started with Windows Azure and Cloud Computing. In it we will explore what Azure is and isn't and get started by building our first Cloud application. Fasten your seatbelts, we're ready to get started with Cloud Computing and Windows Azure.
Video: WMV, MP4 | Audio: WMA | Slides: PPTX

Part 2 - Windows Azure Compute Services
The Cloud provides us with a number of services including storage, compute, networking and more. In this second session we take a look at how roles define what a service is. Beyond the different flavors of roles we show the RoleEntryPoint interface, and how we can plug code in the startup operations to make it easy to scale up instances. We will show how the Service Definition defines the role and provides hooks for customizing it to run the way we need it to.
Video: WMV, MP4 | Audio: WMA | Slides: PPTX

Part 3 - Windows Azure Storage Options
The Cloud provides a scalable environment for compute but it needs somewhere common to store data. In this webcast we look at Windows Azure Storage and explore how to use the various types available to us including Blobs, Tables and Queues. We look at how it is durable, highly available and secured so that we can build applications that are able to leverage its strengths.
Video: WMV, MP4 | Audio: WMA | Slides: PPTX

Part 4 - Intro to SQL Azure
While Windows Azure Storage provides basic storage, often we need to work with relational data. In this week's webcast we dive into SQL Azure and see how it is similar to and different from on-premises SQL Server. From connecting from rich clients as well as web apps, to the management tools available for creating schema and moving data between instances in the cloud and on site, we show you how it's done.
Video: WMV, MP4 | Audio: WMA | Slides: PPTX

Part 5 - Access Control Services and Cloud Identity
Who are you? How do we know? Can you prove it? Identity in the cloud presents us with the same and different challenges from identity in person. Access Control Services is a modern identity selector service that makes it easy to work with existing islands of identity such as Facebook, Yahoo and Google. It is based on standards and works with claims to provide your application with the information it needs to make informed authorization decisions. Join this webcast to see ACS in action and learn how to put it to work in your application today.
Slides: PPTX

Part 6 - Diagnostics & Troubleshooting
So you've built your Cloud application and now something goes wrong. What now? This week's webcast is focused on looking at the options available for gaining insight to be able to find and solve problems. From working with IntelliTrace to capture a run history, to profiling options, to configuring the diagnostics agent, we will show you how to diagnose and troubleshoot your application.

Part 7 - Get Started with Windows Azure Caching Services with Brian Hitney (http://bit.ly/btlod-77)
How can you get the most performance and scalability from platform as a service? In this webcast, we take a look at caching and how you can integrate it in your application. Caching provides a distributed, in-memory application cache service for Windows Azure that provides performance by reducing the work needed to return a requested page.

Part 8 - Get Started with SQL Azure Reporting Services with Mike Benkovich (http://bit.ly/btlod-78)
Microsoft SQL Azure Reporting lets you easily build reporting capabilities into your Windows Azure application. The reports can be accessed easily from the Windows Azure portal, through a web browser, or directly from applications. With the cloud at your service, there's no need to manage or maintain your own reporting infrastructure. Join us as we dive into SQL Azure Reporting and the tools that are available to design connected reports that operate against disparate data sources. We look at what's provided from Windows Azure to support reporting and the available deployment options. We also see how to use this technology to build scalable reporting applications.

Part 9 - Get Started working with Service Bus with Jim O'Neil (http://bit.ly/btlod-79)
No man is an island, and no cloud application stands alone! Now that you've conquered the core services of web roles, worker roles, storage, and Microsoft SQL Azure, it's time to learn how to bridge applications within the cloud and between the cloud and on premises. This is where the Service Bus comes in, providing connectivity for Windows Communication Foundation and other endpoints even behind firewalls. With both relay and brokered messaging capabilities, you can provide application-to-application communication as well as durable, asynchronous publication/subscription semantics. Come to this webcast ready to participate from your own computer to see how this technology all comes together in real time.



Windows Azure Platform Appliance (WAPA), Hyper-V and Private/Hybrid Clouds

Thomas Shinder, MD posted Private Cloud Architecture Sessions at TechEd North America 2012 to the Private Cloud architecture blog on 5/18/2012:

This is a big year for Microsoft! We’re going to see the release of Windows Server 2012 and a slew of other fantastic products – and you’re going to want to learn about them all. The big thing about Windows Server 2012 is that it’s all about the cloud. Private cloud, public cloud, hybrid cloud and more. TechEd North America 2012 is going to be heavy into the cloud and so you want to get up to speed on all things cloud.

This is where the Private Cloud Architecture track sessions come in. Before you start to envision and design your private cloud, you’ll need to make sure it’s built on a firm architectural foundation. A solid grounding in private cloud architecture will enable you to consider all the key design decisions that make the difference between a private cloud that propels your business to unprecedented levels of efficiency and productivity versus just another virtualized infrastructure.

So here I present to you a list of private cloud architecture sessions. We’ve worked hard to make sure that each session builds on the previous – so that your journey to an exemplary private cloud goes right from the start and succeeds by design. I will be delivering several of these sessions and I hope to see you there. As an added incentive, I’ll be bringing a pile of books to give away! Come to one of my sessions and you’ll have the opportunity to win one of the books I’ve written – and I’ll autograph them at no additional charge (in case you want to put them on eBay).

You can find all of these sessions by going to your MyTechEd and searching for Architecture & Practices.

AAP304: Private Cloud Principles, Concepts, and Patterns

Speaker(s): Tom Shinder
Monday, June 11 at 3:00 PM - 4:15 PM in S230E (663)
Architecture & Practices | Breakout Session | 300 - Advanced

So you've heard a lot about the Private Cloud, but what exactly is a Private Cloud? What are the principles, patterns, and concepts that drive a private cloud infrastructure? Attend this session to learn about business value, perception of infinite capacity, predictability, fabric management, fault domains, and many more private cloud architectural components that enable you to realize the true benefits of the private cloud. This is a cornerstone session that you must attend to understand how the private cloud differs from a traditional datacenter and how to architect the private cloud correctly.

AAP315: Optimize the Business with Microsoft Datacenter Services 2.0

Speaker(s): Ulrich Homann
Monday, June 11 at 4:45 PM - 6:00 PM in S331 (350)
Architecture & Practices | Breakout Session | 300 - Advanced

Learn all you need to know about the Microsoft Datacenter Services Solution (DCS) IaaS scenario in a private cloud deployment mode. Find out the new scenarios delivered by the Data Center Services Reference Implementation v2.0, how MCS and Premier can deliver an end-to-end solution including Microsoft Services architecture and implementation. Learn how DCS improves operational efficiency by implementing tools for provisioning, managing, monitoring, and optimizing data center hardware and virtual resources through integrated consoles to reduce the strain on IT staff.

AAP203: Private Cloud Design and Management: Speeding the Transition to a Responsive, Virtualized Storage Infrastructure

Speaker(s): Rob Griffin
Tuesday, June 12 at 10:15 AM - 11:30 AM in S210A (387)
Architecture & Practices | Breakout Session | 200 - Intermediate

Learn key principles to quickly transform your IT operations into a robust, private cloud. Return home with recommendations for achieving performance, availability, and flexibility through the virtualization of your storage infrastructure. Experience how IT organizations incorporate storage hypervisor functions to rapidly provision resources, dynamically distribute workloads across multiple tiers and types of storage, non-disruptively refresh hardware to expand capacity, while tapping into the latest and most cost-effective disk technologies. Learn how you can leverage and optimize Hyper-V, Clustered Shared Volumes, and Microsoft System Center in your virtualized storage environment. You’ll take away concrete and actionable steps considered standard practice by seasoned colleagues in the IT industry to help you build and enhance your private cloud infrastructure. The session includes a live demonstration and a discussion of various real-world customer scenarios.

AAP306: Private Cloud Security Architecture: A Solution for Private Cloud Security

Speaker(s): Tom Shinder, Yuri Diogenes
Tuesday, June 12 at 1:30 PM - 2:45 PM in N320C (600) T-TH
Architecture & Practices | Breakout Session | 300 - Advanced

Cloud computing introduces new opportunities and new challenges. One of those challenges is how security is approached in the private cloud. While private cloud can share a lot of security issues with traditional datacenters, there are a number of key issues that set private cloud security apart from how security is done in the traditional datacenter. In this session, Dr. Tom Shinder and Yuri Diogenes discuss these issues and wrap them into a comprehensive discussion on private cloud security architecture. By taking an architectural approach to private cloud security, you will be able to understand the critical concepts, principles and patterns that drive a successful security implementation of private cloud.

AAP201: Hybrid Computing Is the New Net Norm

Speaker(s): Heath Aubin
Wednesday, June 13 at 1:30 PM - 2:45 PM in S331 (350)
Architecture & Practices | Breakout Session | 200 - Intermediate

Microsoft Services recently gathered subject matter experts and architects actively consulting in the field to research the problem space of Hybrid IT. Hybrid IT is defined as the composition of two or more data center environments (traditional IT, private cloud, or public/host cloud) which are individually unique but are bound together to share or provide services and data. The team discovered that organizations do not set out with a goal of having a “Hybrid” environment. Rather, they are tasked with either going to the public cloud to take advantage of perceived cost reduction or they are looking to centralize services within a disconnected environment, thus providing their own private cloud. Come to this session to learn about Microsoft Services strategy, experience, and best practices within this focus area and the new term we call Hybrid IT.

AAP305: Modern Application Design: Cloud Patterns for Application Architects

Speaker(s): Ulrich Homann
Wednesday, June 13 at 3:15 PM - 4:30 PM in N320E (576)
Architecture & Practices | Breakout Session | 300 - Advanced

How does one build a “modern” application? In this session, learn about the key design patterns ranging from adaptive and insight-driven applications to 'Social'-enabled and aware application design, to Big Data: why HPC is coming to you…Architecture patterns and practices ranging from Enterprise Architecture to design and delivery.

AAP302: The Four Pillars of Identity: A Solution for Online Success

Speaker(s): Heath Aubin
Thursday, June 14 at 10:15 AM - 11:30 AM in S320C (628)
Architecture & Practices | Breakout Session | 300 - Advanced

Internet and cloud computing have revolutionized the way people expect to access and share information. To be successful, organizations need to develop a comprehensive identity and access strategy for providing services to diverse groups of people while assuring secure online access, respecting privacy, and complying with regulatory mandates. Today, there is no shortage of standards, products, or solutions which claim to solve all sorts of identity and security challenges. This session is about framing the Four Pillars of Identity (Administration, Authentication, Authorization, and Audit) as guiding principles when designing a robust and effective identity solution. In this session, we cover key industry trends within the identity space, discuss key aspects of an identity strategy and provide you with guidance and Microsoft Services solution examples which will help you as you begin planning your identity architecture.

AAP307: The Private Cloud: Building up a Self-Service Catalog

Speaker(s): Mike Resseler
Thursday, June 14 at 10:15 AM - 11:30 AM in S320E (280/556)
Architecture & Practices | Breakout Session | 300 - Advanced

Self-Service is one of the key components in the Private Cloud concept. But how do you start with this? What are you going to offer? Who needs to be involved in such a project? This session is all about the Self-Service layer and guides you through creating real-life examples to offer services to your customers.

AAP303: Sustainability in Infrastructure Design

Speaker(s): Jeff Stokes
Thursday, June 14 at 1:00 PM - 2:15 PM in N210 (848)
Architecture & Practices | Breakout Session | 300 - Advanced

This session discusses the importance of considering sustainability and supportability needs in IT Architecture projects. Topics to be explored include monitoring, scaling for performance and stability, management of servicing machines en masse in the data center and beyond.


Edwin Yuen (@edwinyuen) posted Top 5 Reasons to Choose Hyper-V for your Virtualization and Private Cloud Infrastructure on 5/17/2012:

In my last post, I provided some facts related to Microsoft’s virtualization and private cloud offerings to help ensure that our customers and partners have accurate information as they’re exploring technology options and making investments in their infrastructure. Some of these facts are related to providing customers with a scalable and economical private cloud solution that will grow with them and their businesses without imposing taxes on that growth. Other aspects of the post touch on the technical capabilities Microsoft offers, such as deep physical and virtual application insight and the broad extensibility of System Center to work with third party and custom solutions.

Today I wanted to explore some of the top reasons for choosing Windows Server 2008 R2 SP1 with Hyper-V, which we’ve also compiled into a brief whitepaper you can find at this link. These top reasons include:

1. Tremendous Industry Recognition & Strong Adoption by Customers

Hyper-V has experienced strong market growth since its introduction, with market share expanding 409% (from 4.9% to 24.8%) from calendar Q3 2008 to Q3 2011, as noted in the IDC WW Quarterly Server Virtualization Tracker from December 2011. Hyper-V has also been positioned by top industry analysts as a Leader in x86 Server Virtualization Infrastructure.

Customers, including Target, CH2M HILL and many others, are running their businesses on Hyper-V. You can find details on these customers and more through the broad portfolio of case studies available at this link.

2. Built-in Virtualization & Familiarity with Windows

Hyper-V exists in two variations: as a free standalone product called Microsoft Hyper-V Server 2008 R2 SP1, and as an installable role in Windows Server 2008 R2 SP1. Since Hyper-V is an integral part of Windows Server 2008 R2 SP1, it provides great value by enabling IT Professionals to continue to utilize their familiarity with Windows while minimizing the learning curve.

Since its launch in 2008, we have continued to evolve Hyper-V to deliver even more capabilities, increase scalability and enhance performance. Recent feature additions include Dynamic Memory, so customers can better utilize the memory resources of Hyper-V hosts by balancing how memory is distributed between running virtual machines. Another is Live Migration, so customers with multiple Hyper-V physical hosts in their datacenters can easily and rapidly move running virtual machines (VMs) to the best physical computer for performance, scaling, or optimal consolidation without affecting users, thereby reducing costs and increasing productivity.

With the upcoming release of Windows Server 2012 Hyper-V there are even more improvements to performance and scalability to help customers transform their organization’s cloud computing environment.

3. Hyper-V as the Best Choice for Virtualizing Microsoft Workloads

Microsoft has published third-party validated lab results that prove best-in-class performance for Microsoft workloads, including SharePoint 2010, Exchange Server 2010, and SQL Server 2008. Also available is recent data highlighting SQL Server 2012 running on Hyper-V with performance within 88% of a physical server and scaling to 2,090 OLTP transactions per second.

The results demonstrate that Hyper-V can provide enterprise-ready support for tier-1 datacenter applications while maintaining high levels of scalability and performance.

4. Comprehensive Management Capabilities with System Center 2012

With System Center 2012, customers don’t have to settle for a cobbled together collection of management offerings (see the video in item #1 of this post), but instead have a highly integrated solution for managing across their entire environment including their physical, virtual, private and public cloud environments.

Some of the System Center 2012 differentiated capabilities include:

  • Support for multi-hypervisor management
  • Support for third party integration and process automation
  • Ability to manage applications via a single view across private clouds and Windows Azure
  • Deep application diagnostics and insight for Windows and .NET based environments
  • Technologies like Server Application Virtualization, which enable you to abstract your applications from the underlying private cloud infrastructure.

All of this is delivered in a unified solution which is available in the Datacenter edition license which covers unlimited virtual machines.

5. Per Physical Processor Pricing with Unlimited Virtualization Rights

The licensing advantages of the Microsoft virtualization and private cloud offerings over the competition were covered fairly extensively in my previous blog post. The simple way to understand the situation is that Microsoft private cloud solutions are licensed on a per processor basis with unlimited virtualization rights. Microsoft’s licensing allows you to get the cloud computing benefits of scale with unlimited virtualization and lower costs – consistently and predictably over time. The private cloud solutions from VMware charge you more as you grow and are licensed by either the number of VMs or the virtual memory allocated to those VMs.

You can find the details in this whitepaper and run server virtualization and private cloud calculations for yourself with these tools.

I hope the above information has provided you with some useful facts to help you while you’re investigating virtualization and private cloud infrastructures.



Cloud Security and Governance

Dave Asprey (@daveasprey) posted Data in Motion: The Other side of the Cloud Encryption Coin to the Trend [Micro] Security Blog on 5/17/2012:

You’ve probably seen some of my blog posts about the importance of encrypting data stored in the cloud and on servers in traditional data centers, but I write less about encrypting data in motion because most of us are probably thinking “That’s because we have SSL/TLS and IPSec; the problem is solved.” The truth is that the problem is only kind of solved; using these things for clouds is a bit of a kluge because these technologies typically only protect network traffic to the edge of the cloud network, leaving traffic between servers within the cloud network unprotected.

Tunnel-based solutions don’t really work that well in cloud networks due to their point-to-point nature. Since “points” move around very quickly in clouds, point-to-point technologies can easily cause problems with scalability, management and performance in clouds.

I am intrigued by how Certes Networks approaches this problem. They just announced vCEP (Virtual Certes Enforcement Point). According to the release, “the vCEP is a virtual appliance that allows organizations to protect sensitive network traffic among virtual servers and between clouds without using tunnels. It encrypts network traffic from IaaS cloud infrastructures to data centers across the WAN and from server to server within the cloud.”

Huh? Encrypting network traffic without tunnels? The OSI model must be quaking in its boots.

Although they’re not a household name, Certes Networks is a pioneer in network encryption, having deployed the first group encryption solution years ago. Group based encryption removes the need for point-to-point key negotiations, which in turn eliminates tunnels and makes data in motion encryption scalable and transparent to the infrastructure. Since cloud infrastructure is always shifting its configuration, transparent encryption is really important.
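To see why avoiding point-to-point negotiation matters at cloud scale, here is a minimal sketch that counts the tunnels a full mesh of servers would need. It is a simplified model for illustration only, not a description of how Certes Networks' vCEP actually works, and the function name `pairwise_tunnels` is hypothetical.

```python
def pairwise_tunnels(endpoints: int) -> int:
    """Count the point-to-point tunnels needed to fully mesh `endpoints` servers.

    Each pair of servers needs its own negotiated tunnel, so the count is
    the number of unordered pairs: n * (n - 1) / 2.
    """
    if endpoints < 0:
        raise ValueError("endpoints must be non-negative")
    return endpoints * (endpoints - 1) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Simplifying assumption: a group-keyed design shares one
        # policy/key set across the whole group instead of one per pair.
        print(f"{n} servers: {pairwise_tunnels(n)} tunnels vs. 1 shared group key")
```

The quadratic growth in the pairwise case is the scalability problem that the group-based approach is meant to sidestep.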

The other interesting thing here is that the Certes policy and key management system allows IaaS clients to maintain control of their own policies and keys. This really matters for regulated or sensitive workloads. It should also be a welcome development to cloud providers who can address client concerns about security without bearing the legal and administrative burden of owning (or having access to) their clients’ policies and keys.

The availability of cloud security for data in transit fills the security gap between the client’s trusted network and the data protection offered by Trend Micro’s SecureCloud data-at-rest encryption. This doesn’t exactly spell the end of IPsec, but it does make it easier to cryptographically isolate your data and traffic from other cloud clients. And that’s a good thing!



Cloud Computing Events

Goutama Bachtiar (@goudotmobi, pictured below) reported on 5/19/2012 Microsoft Corporate VP Hadba urges app designers to think of cloud design before anything else during his 5/15/2012 presentation to Indonesian startups:

A session held on 15 May for local Indonesian startups, “Microsoft BizSpark Startup Night,” featured a special guest speaker, Walid Abu-Hadba.

The Palestinian-born tech executive, who was also the main presenter, is the Redmond-based corporate vice president, developer and platform evangelism group at Microsoft.

In his 40-minute talk, [Hadba, pictured at right,] spoke about Microsoft’s cloud computing platform, “Windows Azure,” and how he leads efforts in building vibrant solutions ecosystems through technical evangelism, community engagement and audience marketing.

Hadba offered the audience a new perspective when he challenged app designers to think about cloud design before anything else during app design, and only then about how and where to deploy.

“Design the app and cloud as if it will scale up unlimitedly. Never limit yourself or you will suffer. Microsoft offers public, private and hybrid cloud services because it sees the Asian market as very appealing due to its highest 67 percent adoption rates for cloud computing, which is comparatively higher to 61 percent and 57 percent adoption rate in the US and Europe respectively,” says Hadba.

Those who join the BizSpark program will be entitled to technology access such as necessary software, documentation and white papers, and training in both technical and business aspects. Startups will also be listed in Microsoft marketing collateral to be circulated all over the globe.

Another program introduced is BizSpark Plus. Targeting small and promising firms, it provides US$60,000 worth of Windows Azure hosting for free, instead of the tech access, software (such as MS SQL Server, Visual Studio .NET), training and publication mentioned earlier under the BizSpark program. Azure, an open and flexible cloud platform, offers unlimited servers and storage so developers can scale their applications to any size based on their needs.

Hadba also mentioned the Windows Azure Store, where developers can build, submit, and share their code and apps with other developers. Another store is the Windows 8 Store (widely known as Windows Store), which is currently designed for the Windows 8 Beta (also known as the Consumer Preview).

Although the long-awaited Windows Phone store will be available for Indonesians soon, no exact date was given, as Microsoft Indonesia declined to disclose the information.

Combining the developer platform, tools and ecosystem of the developer-centric company, the store, which was first announced in September last year, offers up to an 80 percent revenue share for apps sold. Developers set the price in local currency, and apps can be made available in more than 100 languages. The store handles currency conversions and local tax laws. The pricing model is flexible and rewards popular apps with a better percentage of the net receipts, while delivery options include in-app purchases, trial versions, in-app advertising, and third-party transaction services.

Lastly, Hadba highlighted Windows9.com as a site where one can find everything – videos, samples, design, codes, guides, roadmaps and tutorials – in order to develop a Windows 8 app.

The 3-hour event was wrapped up with showcases from Mobile Game Development Studio Nightspade, KOMPAS app for foodie SajianSedap, City lifestyle directory Urbanesia and a presentation by Agate Studio – all led by Microsoft Indonesia Developer Evangelist Norman Sasono.


Himanshu Singh (@himanshuks) posted Windows Azure Community News Roundup (Edition #19) on 5/18/2012:

Welcome to the latest edition of our weekly roundup of the latest community-driven news, content and conversations about cloud computing and Windows Azure. Here are the highlights from this week.

Articles and Blog Posts

Upcoming Events, and User Group Meetings

North America

Europe

Other

Recent Windows Azure Forums Discussion Threads

Send us articles that you’d like us to highlight, or content of your own that you’d like to share. And let us know about any local events, groups or activities that you think we should tell the rest of the Windows Azure community about. You can use the comments section below, or talk to us on Twitter @WindowsAzure.


Brian Hitney reported Windows Azure Dev Camps Soon! in a 5/18/2012 post:

It’s that time – Windows Azure Dev Camps are coming really soon. Here’s the schedule:

We’re pretty excited to mix up the format a little, with some time to jump into some new areas we haven’t typically talked about in our previous shows:

1. The Azure Platform – An Overview (60 minutes)

image

Let’s start off the day with a dive into Windows Azure. We’ll talk about what Windows Azure offers, from hosting applications to durable storage. We’ll look at Windows Azure role types, hosting web applications and worker processes. We’ll also cover durable storage options, both the traditional relational database offered as SQL Azure and the more cloud-centric offerings in Windows Azure Storage for files, semi-structured data, and queues.

2. Hands on @home with Azure (120 minutes)
For this hands-on portion of the day, we’ll work on the @home with Windows Azure project. The @home project will give you a solid understanding of using Windows Azure in a project that contributes back to Stanford’s Folding@home distributed computing project. We’ll walk through the code, provisioning an account, and getting the application deployed and running.

3. Caching – A Scalable Middle Tier (45 minutes)
Creating a stateless application is a difficult but fundamental aspect of building a scalable application in the cloud. In this session, we’ll talk about the Windows Azure Cache service and using it as a middle tier to maintain state and cache objects that can be shared by multiple instances.

4. SQL Azure, Data Sync, and Reporting (45 minutes)
SQL Azure offers a scalable database as a service without having to configure and maintain hardware. We’ll look at the subtle differences between on premises SQL Server databases and SQL Azure, and how Data Sync can be used to synchronize data between multiple databases both in the cloud and on premises. We’ll also look at SQL Azure Reporting.

5. Windows 8 and Azure – Better Together (60 minutes)
The consumer preview of Windows 8 is out, and it’s the perfect time to ramp up on developing native Metro-style applications. In this session, we’ll give an overview of Windows 8, and delivering a richer user experience by leveraging a cloud backend.


Apigee (@Apigee) announced on 5/18/2012 a Live Webcast: OData Introduction and Impact on API Design scheduled for 5/24/2012 at 11:00 AM PDT:

When: Thursday, May 24th, 11:00am PT / 2:00pm ET

We're in a data-driven economy. Web API designers need to define what and how to expose data from a variety of apps, services, and stores. What are the challenges of unlocking data and opening up access in a straightforward and standards-compliant manner? Is OData the right tool for the job?

Join Anant, Brian, and Greg for a discussion of OData, its API design implications, and the pros and cons of OData as an enabler of data integration and interoperability across Data APIs.

http://forms.apigee.com/acton/form/549/0045:d-0002/0/index.htm

If you can't make the live webcast, register and we'll send you a video recording with slides.

Join to Discuss »

  • OData, SQL, and the "RESTification" of data - providing a uniform way to expose, structure, query and manipulate data using REST principles.
  • Opportunity and challenges for OData.
  • The questions of Web standards and proprietary versus open tools and protocols.


Greg Brail
Anant Jhingran
Brian Pagano

Bruce Kyle suggested that you Join in the ‘Meet Windows Azure’ Event Online June 7 in a 5/18/2012 post:

The Windows Azure team has announced a new event, MEET Windows Azure. Whether you’re just getting started or are a long-time customer, you’ll definitely want to tune in.

For details, see the announcement at Mark Your Calendar—MEET Windows Azure Event Streamed Online June 7th.

image

Hear from Scott Guthrie and other technology leaders about the latest cloud-based development technologies from Windows Azure.

Sign-up today at meetwindowsazure.com.

For more details, see my recent Social meet up on Twitter for Meet Windows Azure on June 7th and “Meet Windows Azure” Event Scheduled for 6/7/2012 in San Francisco posts.



Other Cloud Computing Platforms and Services

Jeff Barr (@jeffbarr) reported Elastic Load Balancer - Console Updates and IPv6 Support for 2 Additional Regions in a 5/18/2012 post:

You can now manage the listeners, SSL certificates, and SSL ciphers for an existing Elastic Load Balancer from within the AWS Management Console. This enhancement makes it even easier to get started with Elastic Load Balancing and simpler to maintain a highly available application using Elastic Load Balancing.

While this functionality has been available via the API and command line tools, many customers told us that it was critical to be able to use the AWS Console to manage these settings on an existing load balancer.

With this update, you can add a new listener with a front-end protocol/port and back-end protocol/port:

If the listener uses encryption (HTTPS or SSL listeners), then you can create or select the SSL certificate:

In addition to selecting or creating the certificate, you can now update the SSL protocols and ciphers presented to clients:

We have also expanded IPv6 support for Elastic Load Balancing to include the US West (Northern California) and US West (Oregon) regions.

Not sure if this rises to the level of “feature of the week,” but it’s close.


Jeff Barr (@jeffbarr) posted RDS Read Replicas in the Virtual Private Cloud on 5/17/2012:

You can now launch Amazon Relational Database Service (RDS) MySQL Read Replicas inside of a Virtual Private Cloud (VPC).

Amazon RDS removes the headaches of running a relational database reliably at scale, allowing Amazon RDS customers to focus on innovation for their customers. Read Replicas enable you to elastically scale out beyond the capacity constraints of a single DB Instance for read-heavy database workloads.

You can now create one or more replicas of a given “source” DB Instance and serve incoming read traffic from multiple copies of your data within a VPC environment. You can create a Read Replica with a few clicks of the AWS Management Console or using the CreateDBInstanceReadReplica API.
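As a rough illustration of the API route, the sketch below wraps the CreateDBInstanceReadReplica call as it is exposed by the present-day boto3 SDK for Python (`create_db_instance_read_replica`), which postdates this post. The instance identifiers are placeholders, and the helper name `create_read_replica` is hypothetical.

```python
def create_read_replica(rds_client, source_id: str, replica_id: str) -> dict:
    """Ask RDS to create a Read Replica of an existing source DB Instance.

    `rds_client` is expected to be a boto3 RDS client, e.g.
    boto3.client("rds"); it is passed in so the helper can be exercised
    with a stub and without AWS credentials.
    """
    return rds_client.create_db_instance_read_replica(
        DBInstanceIdentifier=replica_id,        # name of the new replica
        SourceDBInstanceIdentifier=source_id,   # existing "source" instance
    )


# Example usage (requires boto3 and AWS credentials; identifiers are placeholders):
#   import boto3
#   rds = boto3.client("rds")
#   create_read_replica(rds, "my-source-db", "my-source-db-replica-1")
```

Calling the helper once per replica name is one way to create several replicas of the same source instance.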

Amazon VPC allows you to customize the network configuration to closely resemble a traditional network that you might operate in your own datacenter. I described the process of launching a DB Instance inside of a VPC in an earlier post.

Amazon RDS in VPC enables you to have a DB instance within a private network powering a public web application. The DB instance is on a private subnet, which does not have a public IP address. You can also use Amazon RDS + VPC to run corporate applications that are not intended to be accessed from the Internet.



by Roger Jennings (--rj) (noreply@blogger.com) at May 19, 2012 02:58 PM

Accessing US Air Carrier Flight Delay DataSets on Windows Azure Marketplace DataMarket and “DataHub”

Contents:

The initial five months (October 2011 through February 2012) of US Air Carrier Flight Delays data curated from the U.S. Federal Aviation Administration’s (FAA) On_Time_On_Time_Performance.csv (sic) files is publicly available free of charge in OData and *.csv formats from OakLeaf Systems’ Windows Azure Marketplace DataMarket and Codename “Data Hub” preview sites.

Accessing these datasets, which originate from an On_Time_Performance table of the same SQL Azure server instance, for the first time isn’t an altogether intuitive process, so the following two sections describe how to open the datasets with the DataMarkets’ Data Explorer feature and execute queries against them with the Explorer’s Query Builder and the LINQPad application.

• Updated 5/16/2012 1:30 PM PDT by adding Exporting Data to Excel PowerPivot Tables and Charts section.

• Updated 5/15/2012 12:45 PM PDT with sample C# expressions for a basic C# LINQ query executed in LINQPad and customizing the query projection, as well as fixing a heading typo.


Windows Azure Marketplace DataMarket

1. Navigate to the DataMarket’s home page and, if you have a DataMarket Account, click the Data menu link:

image

2. On the Data page, click the US Air Carrier Flight Delays link:

image

2A. Sign in with your Windows Live ID. If you don’t have a Windows Azure Marketplace DataMarket account, complete the registration form:

image

Note: You don’t need to agree to Microsoft using your email address to continue with registration.

2B. Click Continue to open Microsoft’s terms of use, scroll to the bottom, and mark the I Accept the Terms of Use checkbox to enable the Register button:

image

2C. Click Register to create your account and open the US Air Carrier Flight Delays details page.

3. On the US Air Carrier Flight Delays details page click the Subscribe button:

image

Note: The subscription is free, so you don’t need to provide credit card details.

4. Mark the I Have Read and agree to the Above Publisher’s Offer Terms and Privacy Policy check box:

image

5. Click the Sign Up button to open the Thank You page:

image

6. Click the Explore This Data Set link to open the Build Your Query to Get Started form. See the Building an OData URL Query and Displaying Data section below.


Microsoft Codename “Data Hub”

1. Open the OakLeaf Systems “Data Hub” landing page at https://oakleaf.clouddatahub.net/:

image

2. Click the Transportation and Navigation menu link to open the page for OakLeaf datasets in that category:

image

3. Click the US Air Carrier Flight Delays link to open its details page:

image

3A. Click the Sign in to Add to Your Collection button to open the Windows Live Sign In page. Type your Windows Live ID and Password in the text boxes and click Sign In to open the Registration page, if you aren’t already registered:

image

3B. Click Register to open the US Air Carrier Flight Delays details page.

image

4. Click the Add to Collection button to add the dataset to your datasets collection:

image

5. Click the Use link to open the same Build Your Query to Get Started page as the DataMarket version. Add query criteria and click Run Query to return the same first page of the dataset:

image


Building an OData URL Query and Displaying Data

1. Type discrete values to filter by in the appropriate text boxes:

image

Note: The preceding query returns the first 100 records for Southwest Airlines (WN) flights to Oakland Airport (OAK) during February 2012. If you omit all optional parameters you return the first 100 records in the order in which they were entered.

2. Click the Run Query button to return 100 rows and display about the first 20 rows in the default grid:

image

The OData URL query syntax is: https://api.datamarket.azure.com/Data.ashx/oakleaf/US_Air_Carrier_Flight_Delays_Incr/On_Time_Performance?$filter=Carrier%20eq%20%27WN%27%20and%20Dest%20eq%20%27OAK%27%20and%20Month%20eq%202%20and%20Year%20eq%202012&$top=100

Note: If the dataset had a transaction limit for free trials, each 100 rows you download represents a single transaction. Free OakLeaf datasets don’t have a trial transaction limit.

3. Click the right-arrow button (above the FlightData column header) to display the next set of rows. Alternatively, type a page number in the Go To text box and click Go to display those rows.

4. Click the XML button to display the rows in the Open Data (OData) format:

image

Note: Clicking the Chart button lets you build a query to visualize the data. Unfortunately, Flight Delay data doesn’t lend itself to visualizations other than by histograms, which the current Data Explorer version doesn’t support directly.


Using LINQPad to Execute URL Flexible Queries

My Querying Microsoft’s Codename “Social Analytics” OData Feeds with LINQPad of 11/5/2011 describes how to obtain Joseph Albahari’s free LINQPad application and use it to query OData feeds from the Codename “Social Analytics” live Twitter feed. Following are the procedure and examples for querying the DataMarket’s US Air Carrier Flight Delays dataset:

1. Download and install LINQPad 4.42.1 or later for .NET Framework 4.0 or later from the LINQPad Web site.

2. Log in to the DataMarket’s OakLeaf Systems dataset with the Windows Live ID you used for the above examples, proceed to the Build Your Query To Get Started page, select the Service Root URL value, https://api.datamarket.azure.com/oakleaf/US_Air_Carrier_Flight_Delays_Incr/, copy it to the clipboard, and paste it into a Notepad instance.

image

3. Click the Primary Account Key label’s Show link to display the Account Key value. Select the value, copy it to the clipboard, and paste it to the same Notepad instance.

image

4. Optionally, execute the URI-encoded URL query and copy its contents to the Clipboard and then to the Notepad instance: https://api.datamarket.azure.com/Data.ashx/oakleaf/US_Air_Carrier_Flight_Delays_Incr/On_Time_Performance?$filter=Carrier%20eq%20%27WN%27%20and%20Dest%20eq%20%27OAK%27%20and%20Month%20eq%202%20and%20Year%20eq%202012&$top=100

Note: %20 = ASCII(0x20) = space and %27 = ASCII(0x27) = ' (single quote).
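To read an encoded query more easily, you can reverse the percent-encoding. This short Python sketch decodes the query-string portion of the URL in step 4 with the standard library’s unquote function:

```python
from urllib.parse import unquote

# Query-string portion of the URL in step 4.
encoded = ("$filter=Carrier%20eq%20%27WN%27%20and%20Dest%20eq%20%27OAK%27"
           "%20and%20Month%20eq%202%20and%20Year%20eq%202012&$top=100")

# unquote() turns each %XX escape back into the character with that hex code.
decoded = unquote(encoded)
print(decoded)
```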

5. Launch LINQPad and click the Add Connection link to open the Choose Data Context dialog. Accept the default Build Data Context automatically option and select the Microsoft DataMarket Service item in the LINQPad Drivers list:

image

6. Click the Next button to open the DataMarket Connection dialog, copy the Service Root URL to the Clipboard and paste it to the DataMarket Service URI text box. Do the same with the Account Key value, add a friendly name (US Air Carrier Flight Delays) for the connection, and mark the Remember This Connection check box:

image

7. Click the Test button to verify the URI and Key, dismiss the Connection Successful message, click OK to add the connection, expand the connection node and its On_Time_Performance child node, and drag the On_Time_Performance link into the Query 1 tab’s text box:

image

8. Click the button with the green arrow icon to execute the query and display the result in a grid:

image

9. Click the Request Log button to display the request syntax:

image

Note: The Request syntax is https://api.datamarket.azure.com/oakleaf/US_Air_Carrier_Flight_Delays_Incr/On_Time_Performance, but the preceding URL requires a user name and password.
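Outside LINQPad, a client must supply those credentials itself. The following Python sketch shows how an HTTP Basic Authorization header could be attached to the request. It rests on an assumption not confirmed in this post: that the service accepts Basic authentication with the Account Key as the password. The user name and key shown are placeholders, and the sketch builds the request without sending it:

```python
import base64
import urllib.request

REQUEST_URL = ("https://api.datamarket.azure.com/oakleaf/"
               "US_Air_Carrier_Flight_Delays_Incr/On_Time_Performance")


def basic_auth_header(user_name, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user_name}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_request(account_key):
    # Assumption: the Account Key serves as the password; "accountKey" is a placeholder user name.
    request = urllib.request.Request(REQUEST_URL)
    request.add_header("Authorization", basic_auth_header("accountKey", account_key))
    return request


request = build_request("YOUR-ACCOUNT-KEY")  # placeholder, not a real key
print(request.get_header("Authorization"))
```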

Executing LINQ Queries with C# Expressions

LINQPad lets you execute Language-Integrated Query (LINQ) expressions, which the software translates to OData URL queries, in C# or VB. LINQ syntax is similar to SQL, but the clause order differs: the from clause comes first and the select clause last, as in:

from o in On_Time_Performance
where o.Carrier == "WN" && o.Month == 2 && o.Year == 2012
&& o.Dest == "OAK" && o.DepDelayMinutes >= 10
orderby o.DepDelayMinutes descending
select o

which returns a subset of the preceding rowset where the departure delay is equal to or greater than 10 minutes. LINQ queries offer much more versatile query capability than the Query Builder’s simple value-based filters.
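For readers who don’t use C#, the same filter-then-sort logic can be sketched in Python over in-memory rows. The sample rows are hypothetical and carry only the columns the query touches; the sketch mirrors the where and orderby clauses and isn’t a substitute for the OData translation LINQPad performs:

```python
# Hypothetical sample rows; the real feed returns many more columns.
rows = [
    {"Carrier": "WN", "Dest": "OAK", "Month": 2, "Year": 2012, "DepDelayMinutes": 45},
    {"Carrier": "WN", "Dest": "OAK", "Month": 2, "Year": 2012, "DepDelayMinutes": 5},
    {"Carrier": "WN", "Dest": "LAX", "Month": 2, "Year": 2012, "DepDelayMinutes": 30},
    {"Carrier": "AA", "Dest": "OAK", "Month": 2, "Year": 2012, "DepDelayMinutes": 60},
    {"Carrier": "WN", "Dest": "OAK", "Month": 2, "Year": 2012, "DepDelayMinutes": 10},
]

# where ... orderby o.DepDelayMinutes descending select o
delayed = sorted(
    (o for o in rows
     if o["Carrier"] == "WN" and o["Month"] == 2 and o["Year"] == 2012
     and o["Dest"] == "OAK" and o["DepDelayMinutes"] >= 10),
    key=lambda o: o["DepDelayMinutes"],
    reverse=True,
)
print([o["DepDelayMinutes"] for o in delayed])
```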

You can learn more about LINQ query syntax from articles linked to the MSDN Library’s LINQ Query Expressions (C# Programming Guide). My Professional ADO.NET 3.5 with LINQ and the Entity Framework book for Wiley/WROX has seven chapters about writing LINQ queries.

To execute the preceding expression and view the resulting OData URL query, do the following:

1. Copy and paste (or type) the LINQ query expression above into LINQPad’s query pane and click the execute button (with the green arrow) to display the first few rows in descending order of departure delay time:

image

2. Click the Request log button to display the first part of the URL query syntax:

image

The complete URL query is: https://api.datamarket.azure.com/oakleaf/US_Air_Carrier_Flight_Delays_Incr/On_Time_Performance()?$filter=((((Carrier eq 'WN') and (Month eq 2)) and (Year eq 2012)) and (Dest eq 'OAK')) and (DepDelayMinutes ge 10)&$orderby=DepDelayMinutes desc

Notice that the space (%20) and single-quote (%27) symbols are omitted in the URL query displayed. These are added by URI-encoding the query before sending it to the DataMarket.
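The nested parentheses in the displayed $filter value follow a simple left-to-right pattern, which this Python sketch reproduces before URI-encoding the result. The function name is hypothetical, and leaving the parentheses unescaped is an assumption; the client may escape them too:

```python
from urllib.parse import quote


def nest_predicates(predicates):
    """Left-fold predicates into the parenthesized form shown in the request log."""
    expr = f"({predicates[0]})"
    for index, predicate in enumerate(predicates[1:]):
        if index > 0:
            expr = f"({expr})"  # wrap the accumulated conjunction before extending it
        expr = f"{expr} and ({predicate})"
    return expr


displayed = nest_predicates([
    "Carrier eq 'WN'", "Month eq 2", "Year eq 2012",
    "Dest eq 'OAK'", "DepDelayMinutes ge 10",
])
# Assumption: parentheses stay literal; spaces become %20 and single quotes %27.
encoded = quote(displayed, safe="()")
print(displayed)
print(encoded)
```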

Customizing the Projection with LINQ Queries

3. To customize the projection (i.e., columns displayed) by omitting Month, Year, and RowId columns and rearranging column order, replace select o with the following expression:

select new
{
    o.Carrier,
    o.FlightDate,
    o.Dest,
    o.Origin,
    o.ArrDelayMinutes,
    o.DepDelayMinutes
}

4. Execute the query to return the following data:

image

5. Click the Request Log button to display the following URL query, which adds a $select clause and returns an IQueryable<Anonymous> type:

https://api.datamarket.azure.com/oakleaf/US_Air_Carrier_Flight_Delays_Incr/On_Time_Performance()?$filter=((((Carrier eq 'WN') and (Month eq 2)) and (Year eq 2012)) and (Dest eq 'OAK')) and (DepDelayMinutes ge 10)&$orderby=DepDelayMinutes desc&$select=Carrier,FlightDate,Dest,Origin,ArrDelayMinutes,DepDelayMinutes
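The projection has two effects: it adds a $select option to the URL and it trims each returned row to the listed columns in the listed order. A rough Python sketch of both follows; the sample row and its values are hypothetical:

```python
# Column order matches the anonymous type in the select new { ... } expression.
COLUMNS = ["Carrier", "FlightDate", "Dest", "Origin", "ArrDelayMinutes", "DepDelayMinutes"]


def select_clause(columns):
    """Build the $select query option from an ordered column list."""
    return "$select=" + ",".join(columns)


def project(row, columns):
    """Keep only the requested columns, in the requested order."""
    return {name: row[name] for name in columns}


# Hypothetical row with the columns that the projection omits.
row = {"RowId": 1, "Year": 2012, "Month": 2, "Carrier": "WN", "FlightDate": "2012-02-01",
       "Origin": "LAX", "Dest": "OAK", "DepDelayMinutes": 45, "ArrDelayMinutes": 38}

print(select_clause(COLUMNS))
print(project(row, COLUMNS))
```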

Note: LINQPad’s driver for OData connections doesn’t support SQL queries.


Exporting Data to Excel PowerPivot Tables and Charts

My two-year-old Enabling and Using the OData Protocol with SQL Azure of 3/26/2010 explained “how to enable the OData protocol for specific SQL Azure instances and databases, query OData sources, and display formatted Atom 1.0 data from the tables in Internet Explorer 8 and Excel 2010 PowerPivot tables.” The post also provided a “comparison of PowerPivot for Excel and Tableau (see the end of the “Working with OData Feeds in PowerPivot for Excel 2010” section.)”

My later Using the Microsoft Codename “Social Analytics” API with Excel PowerPivot and Visual Studio 2010 of 11/2/2011 explained how to open an empty PowerPivot worksheet and connect to a dataset provided by the Windows Azure Marketplace DataMarket.

This example uses the Explorer’s Export to Excel 2010 PowerPivot feature to create a worksheet and chart of average departure delays by air carrier for a particular month. The procedure assumes that you have a Marketplace account and a free subscription to the US Air Carrier Flight Delays dataset.

1. Download and install the x86 or x64 version of Excel PowerPivot from the download page to match the bitness of your Office 2010 installation.

2. Open the US Air Carrier Flight Delays dataset, sign into the DataMarket with an account that has a subscription to the dataset, click the Explore This Dataset link, specify 2 as the Month and 2012 as the Year for optional parameters, and click the Export button to open the Export pane:

image

Note: Specifying a single month and year limits the spreadsheet to about 500,000 rows.

3. Accept the Excel PowerPivot option, click the lower Download button, and click Open when asked if you want to open or save the ServiceQuery.atomsvc file from datamarket.azure.com. Excel’s Table Import Wizard dialog opens. Replace the default Friendly Connection Name with US Air Carrier Flight Delays for this example, copy your Account Key from Notepad and paste it into the Account Key text box, and mark the Save My Account Key check box:

image

Note: If you don’t have the Account Key copy in Notepad, click the Find button to open the Account Keys page, select your Account Key, and then copy and paste it to the text box.

4. Click Next to open the Wizard’s Select Tables and Views dialog:

image

Note: If you receive the following message:

image

the account key you entered probably is the problem. Log in with the account you used to create the DataMarket offering, and repeat the process in the preceding note.

5. Click the Preview & Filter button to display the Preview Selected Table dialog. Clear the DayofMonth, Month, RowId and Year check boxes to omit the columns from the PowerPivot worksheet:

image

6. Click OK to close the dialog and click Finish to begin downloading data. The Status column displays the number of rows downloaded:

image

7. After all rows download, click Close to dismiss the Wizard and view the data in a worksheet:

image

8. Open the PivotTable gallery and select Pivot table to open a new sheet with a PowerPivot Field List pane. Mark the Carrier and DepDelayMinutes check boxes to add Row Labels for Carriers and default Sum of DepDelayMinutes values:

image

9. Click the arrow in the Values text box’s Sum of DepDelay… item, select Edit Measure to open the Measure Settings dialog and select Average as the aggregation function:

image

10. Click OK to close the dialog and display the average values:

image

11. Click the PivotTable button again, select Pivot Chart and specify this worksheet:

image

12. Click the Field List to hide the pane, change column C’s heading to Average, change the cell format of Column C to Number, expand the chart by dragging the corners, edit the title as shown below, and select and delete the Total legend:

image
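The PivotTable in steps 8 through 10 groups rows by Carrier and averages DepDelayMinutes. The same aggregation can be sketched in a few lines of Python; the sample values are hypothetical and the function name is illustrative only:

```python
from collections import defaultdict


def average_delay_by_carrier(rows):
    """Average DepDelayMinutes per carrier, like the PivotTable's Average measure."""
    totals = defaultdict(lambda: [0.0, 0])  # carrier -> [sum of delays, row count]
    for carrier, delay in rows:
        totals[carrier][0] += delay
        totals[carrier][1] += 1
    return {carrier: total / count for carrier, (total, count) in sorted(totals.items())}


# Hypothetical (Carrier, DepDelayMinutes) pairs.
sample = [("WN", 10), ("WN", 20), ("AA", 0), ("AA", 30), ("AA", 60)]
print(average_delay_by_carrier(sample))
```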


Visualizing Flight Delay Data with Tableau Software

Tableau Software (@Tableau) publishes data visualization software with an emphasis on big data. According to the Tableau blurb on OakLeaf’s US Air Carrier Flight Delays offering on the Windows Azure Marketplace DataMarket:

Tableau provides drag-and-drop data visualization based on best practices and patented technology from Stanford University. Tableau allows you to publish dashboards to the web with one click. It’s rapid-fire business intelligence that anyone can use.

According to the publisher:

Tableau Public is a free service that lets you create and share data visualizations on the web. Thousands use it to share data on websites and blogs and through social media like Facebook and Twitter. Tableau Public allows you to see data efficiently and powerfully without any programming.

Easy drag & drop interface:
  • No programming language
  • No plug-ins
  • No Flash, so it shows up on the iPad …
How it works:

Tableau Public visualizations and data are always public. Anyone can interact with your visualizations, see your data, download it, and create their own visualizations from it.

When you save your visualizations, it will be to the publically accessible Tableau Public web servers -- nothing is saved locally on your computer. You can then embed your visualization on your blog or website or share it through social media or email.

Tableau Public can connect to several data sources, including Microsoft Excel, Microsoft Access, and multiple text file formats. It has a limit of 100,000 rows of data allowed in any single file and there is a 50 megabyte limit on storage space for data. [Emphasis added.]

Warning: As noted in the preceding quotation, Tableau Public works with a maximum of 100,000 data rows, which you won’t discover until you attempt to use a query that returns more than that amount of data. 
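A quick check of the expected row count before exporting avoids that surprise. This Python sketch encodes only the 100,000-row limit quoted above; the function name is hypothetical, and the row counts in the example are the approximate figures given elsewhere in this post:

```python
TABLEAU_PUBLIC_MAX_ROWS = 100_000  # limit quoted from the publisher


def fits_tableau_public(row_count):
    """True when a query result is within Tableau Public's stated row limit."""
    return row_count <= TABLEAU_PUBLIC_MAX_ROWS


print(fits_tableau_public(86_940))    # the LAX-only query used in the steps below
print(fits_tableau_public(500_000))   # roughly one month of all flights
```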

The DataMarket provides a terse Using Marketpalce [sic] Datasets with Tableau Public tutorial with no screen captures of successful visualizations. The following procedure doesn’t suffer from those omissions:

1. Download Tableau Public 7.0 from its download page. The download requires entering your email address.

Tip: The Thank You for Downloading … page appears immediately, but you must wait for a few minutes for the Do You Want to Run or Save TableauDesktop.msi … ? message to appear before taking additional actions.

image

2. Click Run to install the software, accept the license agreement, and click Install. Watch the Getting Started video, if you want. Close the Book 1 page.

3. Open the US Air Carrier Flight Delays dataset, sign into the DataMarket with an account that has a subscription to the dataset, click the Explore This Dataset link, and specify LAX as the optional parameter, which returns 86,940 rows with the current dataset. Click the Export button to open the Export pane, and mark the Tableau option in the Export to Program section:

image

4. Click the lower Download button and click Open when presented with the following message:

image

5. Click the Show button (see step 3’s screen capture) to display your Primary Account Key, copy it to the Clipboard and paste it in Tableau Public’s Login dialog:

image

6. Click OK to download the data and open Tableau Public’s main page, and then drag the Carrier dimension to the top Drop Field Here region, as shown here:

image

7. Select the DepDelayMinutes measure and drag it to the Rows shelf; select the Carrier dimension and drag it to the Columns shelf to enable appropriate chart styles in the Show Me gallery.

8. Open the Rows menu, select Measure (Sum) and choose Average in the submenu:

image

9. Here’s the basic column chart from the preceding steps.

image

You can edit the chart and axis titles, but due to the limited number of rows accommodated, further work usually isn’t warranted.

I have recommended that the DataMarket team add notice of the limitation in the number of rows supported by Tableau Public.


by Roger Jennings (--rj) (noreply@blogger.com) at May 19, 2012 01:44 PM

May 18, 2012

Amazon Web Services

Elastic Load Balancer - Console Updates and IPv6 Support for 2 Additional Regions

You can now manage the listeners, SSL certificates, and SSL ciphers for an existing Elastic Load Balancer from within the AWS Management Console. This enhancement makes it even easier to get started with Elastic Load Balancing and simpler to maintain a highly available application using Elastic Load Balancing. While this functionality has been available via the API and command line tools, many customers told us that it was critical to be able to use the AWS Console to manage these settings on an existing load balancer.

With this update, you can add a new listener with a front-end protocol/port and back-end protocol/port:

If the listener uses encryption (HTTPS or SSL listeners), then you can create or select the SSL certificate:

In addition to selecting or creating the certificate, you can now update the SSL protocols and ciphers presented to clients:

We have also expanded IPv6 support for Elastic Load Balancing to include the US West (Northern California) and US West (Oregon) regions.

-- Jeff;

 

by AWS Evangelist at May 18, 2012 06:03 PM

OakLeaf Systems

Social meet up on Twitter for Meet Windows Azure on June 7th

• Updated 5/17/2012 with the current MEET Windows Azure Blog Relay members.

Magnus Mårtensson (@noopman) posted on 5/16/2012 an article of the above name, which begins:

Here’s a perhaps rather redundant event for you but it should be kind of fun: MEET Windows Azure on Twitter (+ Beer). The idea is to list people who have a twitter account and intend to follow the MEET Windows Azure event via live streaming on June 7th (1pm PDT).

So see you online for the event on the 7th! My Twitter handle is @noopman!

Here are the current MEET Windows Azure Blog Relay members as of 5/17/2012 (in order of joining):

Thanks to @noopman!


by Roger Jennings (--rj) (noreply@blogger.com) at May 18, 2012 02:03 PM

SearchCloudComputing (Carl Brooks)

An open dialog on open source cloud

Advocates of open source cloud may swear by its “no rules” development environment, but a lack of in-house expertise may just have you swearing.

by Michelle Boisvert, Senior Site Editor (mboisvert@techtarget.com) at May 18, 2012 12:17 PM

William Vambenepe

REST + RDF, finally a practical solution?

The W3C has recently approved the creation of the Linked Data Platform (LDP) Working Group. The charter contains its official marching orders. Its co-chair Erik Wilde shared his thoughts on the endeavor.

This is good. Back in 2009, I concluded a series of three blog posts on “REST in practice for IT and Cloud management” with:

I hereby conclude my “REST in practice for IT and Cloud management” series, with the intent to eventually start a “Linked Data in practice for IT and Cloud management” series.

I never wrote that later part, because my work took me away from that pursuit and there wasn’t much point writing down ideas which I hadn’t put to the test. But if this W3C working group is successful, they will give us just that.

That’s a big “if” though. Religious debates and paralyzing disconnects between theorists and practitioners are all-too-common in tech, but REST and Semantic Web (of which RDF is the foundation) are especially vulnerable. Bringing these two together and trying to tame both sets of daemons at the same time is a daring proposition.

On the other hand, there is already a fair amount of relevant real-life experience (e.g. data.gov.uk – read Jeni Tennison on the choice of Linked Data). Plus, Erik is a great pick to lead this effort (I haven’t met his co-chair, IBM’s Arnaud Le Hors). And maybe REST and RDF have reached the mythical point where even the trolls are tired and practicality can prevail. One can always dream.

Here are a few different ways to think about this work:

The “REST doesn’t provide interoperability” perspective

RESTful API authors too often think they can do without a metamodel. Or that a format (like XML or JSON) can be used as a metamodel. Or they punt the problem towards defining a multitude of MIME types. This will never buy you interoperability. Stu explained it before. Most problems I see addressed via RESTful APIs, in the IT/Cloud management realm, are modeling problems first and only secondarily protocol/interaction problems. And their failures are failures of modeling. LDP should bring modeling discipline to REST-land.

The “RDF was too much, too soon” perspective

The RDF stack is mired in complexity. By the time people outside of academia had formed a set of modeling requirements that cried for RDF, the Semantic Web community was already deep in the weeds and had overloaded its basic metamodel with enough classification and inference technology to bury its core value as a simple graph-oriented and web-friendly metamodel. What XSD-fever was to making XML seem overly complex, OWL-fever was to RDF. Tenfold.

Everything that the LDP working group is trying to achieve can be achieved today with existing Semantic Web technologies. Technically speaking, no new work is needed. But only a handful of people understand these technologies enough to know what to use and what to ignore, and as such this application doesn’t have a chance to materialize. Which is why the LDP group is needed. But there’s a reason why its starting point document is called a “profile”. No new technology is needed. Only clarity and agreement.

For the record, I like OWL. It may be the technology that most influenced the way I think about modeling. But the predominance of RDFS and OWL (along with ugly serializations) in Semantic Web discussions kept RDF safely out of sight of those in industry who could have used it. Who knows what would have happened if a graph query language (SPARQL) had been prioritized ahead of inference technology (OWL)?

The Cloud API perspective

The scope of the LDP group is much larger than Cloud APIs, but my interest in it is mostly grounded in Cloud API use cases. And I see no reason why the requirements of Cloud APIs would not be 100% relevant to this effort.

What does this mean for the Cloud API debate? Nothing in the short term, but if this group succeeds, the result will probably be the best technical foundation for large parts of the Cloud management landscape. Which doesn’t mean it will be adopted, of course. The LDP timeline calls for completion in 2014. Who knows what the actual end date will be and what the Cloud API situation will be at that point. AWS APIs might be entrenched de-facto standards, or people may be accustomed to using several APIs (via libraries that abstract them away). Or maybe the industry will be clamoring for reunification and LDP will arrive just on time to underpin it. Though the track record is not good for such “reunifications”.

The “ghost of WS-*” perspective

Look at the 16 “technical issues” in the LDP working group charter. I can map each one to the relevant WS-* specification. E.g. see this as it relates to #8. As I’ve argued many times on this blog, the problems that WSMF/WSDM/WS-Mgmt/WS-RA and friends addressed didn’t go away with the demise of these specifications. Here is yet another attempt to tackle them.

The standards politics perspective

Another “fun” part of WS-*, beyond the joy of wrangling with XSD and dealing with multiple versions of foundational specifications, was the politics. Which mostly articulated around IBM and Microsoft. Well, guess what the primary competition to LDP is? OData, from Microsoft. I don’t know what the dynamics will be this time around; Microsoft and IBM alone don’t command nearly as much influence over the Cloud infrastructure landscape as they did over the XML middleware standardization effort.

And that’s just the corporate politics. The politics between standards organizations (and those who make their living in them) can be just as hairy; you can expect that DMTF will fight W3C, and any other organization which steps up, for control of the “Cloud management” stack. Not to mention the usual co-opetition between de facto and de jure organizations.

The “I told you so” perspective

When CMDBf started, I lobbied hard to base it on RDF. I explained that you could use it as just a graph-based metamodel, that you could ignore the ontology and inference part of the stack. Which is pretty much what LDP is doing today. But I failed to convince the group, so we created our own metamodel (at least we explicitly defined one) and our own graph query language and that became CMDBf v1. Of course it was also SOAP-based.

KISS and markup

In closing, I’ll just make a plea for practicality to drive this effort. It’s OK to break REST orthodoxy. And not everything needs to be in RDF either. An overarching graph model is needed, but the detailed description of the nodes can very well remain in JSON, XML, or whatever format does the job for that node type.

All the best to LDP.

by @vambenepe at May 18, 2012 07:54 AM

Amazon Web Services

RDS Read Replicas in the Virtual Private Cloud

You can now launch Amazon Relational Database Service (RDS) MySQL Read Replicas inside of a Virtual Private Cloud (VPC).

Amazon RDS removes the headaches of running a relational database reliably at scale, allowing Amazon RDS customers to focus on innovation for their customers. Read Replicas enable you to elastically scale out beyond the capacity constraints of a single DB Instance for read-heavy database workloads.

You can now create one or more replicas of a given “source” DB Instance and serve incoming read traffic from multiple copies of your data within a VPC environment. You can create a Read Replica with a few clicks of the AWS Management Console or using the CreateDBInstanceReadReplica API.

Amazon VPC allows you to customize the network configuration to closely resemble a traditional network that you might operate in your own datacenter. I described the process of launching a DB Instance inside of a VPC in an earlier post.

Amazon RDS in VPC enables you to have a DB instance within a private network powering a public web application. The DB instance is on a private subnet, which does not have a public IP address. You can also use Amazon RDS + VPC to run corporate applications that are not intended to be accessed from the Internet.

-- Jeff;

by AWS Evangelist at May 18, 2012 01:31 AM

May 17, 2012

Amazon Web Services

New Mechanical Turk Categorization App

Categorization is one of the more popular use cases for the Amazon Mechanical Turk. A categorization HIT (Human Intelligence Task) asks the Worker to select from a list of options. Our customers use HITs of this type to assign product categories, match URLs to business listings, and to discriminate between line art and photographs.

Using our new Categorization App, you can start categorizing your own items or data in minutes, eliminating the learning curve that has traditionally accompanied this type of activity. The app includes everything that you need to be successful including:

  1. Predefined HITs (no HTML editing required).
  2. Pre-qualified Master Workers (see Jinesh's previous blog post on Mechanical Turk Masters).
  3. Price recommendations based on complexity and comparable HITs.
  4. Analysis tools.

The Categorization App guides you through the four simple steps that are needed to create your categorization project.

First, you create the project, assign a name to it, and enter the question that you want to ask of the workers.

Next, you provide the categories that the Master Workers will use. You simply enter the names and the App will generate the HTML for you. You also have the opportunity to supply instructions as part of the step.

The next step is to upload the data to be categorized. You simply upload a CSV file and select the fields that you'd like the Workers to see. If one of your fields contains links to images to be displayed as part of the HIT, you can also set that up in this step.

Finally, you review the pricing information and the total cost for your HIT. Once everything looks good, you go ahead and Publish the HITs and await the results.

Two Master Workers will handle each HIT. After the Workers have finished all of the HITs in the project, you can use the Result Analysis tools to see a summary of the results. You'll be able to see how often both of the Master Workers agreed on a categorization, and how often they disagreed. You can also download the actual categorization data for each of your items.
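The agreement figure described above is easy to recompute from downloaded categorization data. The Python sketch below assumes a hypothetical layout of one pair of worker answers per item; it illustrates the calculation only and is not the Result Analysis tool’s own logic:

```python
def agreement_rate(results):
    """Fraction of items on which both workers chose the same category."""
    if not results:
        return 0.0
    agreed = sum(1 for first, second in results.values() if first == second)
    return agreed / len(results)


# Hypothetical answers: item id -> (first worker's category, second worker's category).
results = {
    "item-1": ("photograph", "photograph"),
    "item-2": ("line art", "photograph"),
    "item-3": ("line art", "line art"),
    "item-4": ("photograph", "photograph"),
}
print(agreement_rate(results))
```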

Read more about this feature on the Mechanical Turk Blog.

-- Jeff;

 

by AWS Evangelist at May 17, 2012 11:34 PM

OakLeaf Systems

Windows Azure Sessions at the Microsoft Worldwide Partner Conference 2012

Here are two Windows Azure-related sessions at the Worldwide Partner Conference (@WPC) 2012 to be held 7/8 through 7/12 in Toronto, Canada. From the session list:

U.S. ISV: Why Build Your Application on Microsoft Azure?

  • Session Type: Breakout Session
  • Track: US Sub
  • Session Details:
Join three ISV partners as they share their journey to Windows Azure. They will discuss the benefits of Windows Azure over other platforms, how it has changed their business, and how they have planned their business to be successful using a cloud strategy. The partners are Quest Software (who chose a horizontal application that expands their current offerings on the value of SQL Server 2012), Johnson Controls (developed a vertical manufacturing application), and Commvault (released a leading storage application that compliments their on-premises solutions). Target audience: ISV.

U.S. Windows Azure: Partner-to-Partner Business Opportunities with Windows Azure

  • Session Type: Breakout Session
  • Track: US Sub
  • Session Details:
Capitalize on the Windows Azure platform to offer new, innovative solutions and develop new partner-to-partner business relationships that benefit your customers and accelerate your business plans. Hear ways you can go to market and use the Windows Azure platform:
  • How do I market my new Windows Azure application?
  • How do I build a channel from the Microsoft partner ecosystem?
And, network with born in the cloud partners (BICs), cloud software vendors (CSVs), SIs, and ISVs to identify and build new business opportunities. Target audience: Born in the Cloud partner (BIC), cloud software vendor (CSV), SI, ISV.

by Roger Jennings (--rj) (noreply@blogger.com) at May 17, 2012 06:30 PM

SearchCloudComputing (Carl Brooks)

HP and Dell dominate server market but struggle in cloud

HP and Dell are the largest hardware makers, but the two companies struggle to present strategies that portray them as serious cloud players.

by Stuart J. Johnston, Senior News Writer (editor@searchcloudcomputing.com) at May 17, 2012 12:49 PM

OakLeaf Systems

“Meet Windows Azure” Event Scheduled for 6/7/2012 in San Francisco

• Updated 5/17/2012 with event schedule (see end of post.) I received my ticket to attend.

The Windows Azure Team announced on 5/15/2012 “An innovative online and in-person event introducing cloud-based development technologies from Windows Azure” on June 7, 2012 at 1 PM PDT. Click here to open the landing billboard:

image

Register here to attend online and for a chance for a ticket to attend in person:

image

Hope to See You There …

• MEET Windows Azure Schedule

Register and Eat
11:30 AM - 1:00 PM
Connect with other technology innovators and get ready to meet Azure.

MEET Windows Azure
1:00 PM - 5:00 PM
See the technology with a keynote by Scott Guthrie, CVP Windows Azure Application Platform -- and special guests you don’t want to miss.

Contest / Hackathon
5:00 PM - 11:30 PM
Get a kick-start in building the next great thing and win some prizes with Windows Azure and Twilio.

Food Trucks and Beer Garden
5:00 PM - 6:00 PM
Fuel up for a great night as we host food trucks and a beer garden.

Party
6:00 PM - Midnight
Don’t miss the afterparty, with an open bar, great eats, and a live show with Steve Aoki.

Check the attendees claiming attendance at the San Francisco venue by clicking here. Many are from Europe or farther away (e.g. China), so I don’t expect to see them at the venue.

by Roger Jennings (--rj) (noreply@blogger.com) at May 17, 2012 11:37 AM

Amazon Web Services

AWS Elastic Beanstalk Now Available in Europe

Today we are expanding the availability of AWS Elastic Beanstalk to the EU (Ireland) region. This comes hot on the heels of our recent announcement of .NET support and our launch in Japan, and gives you the ability to choose any one of three regions for deployment (see the AWS Global Infrastructure map for more information).

With Elastic Beanstalk, you retain full control over the resources running your application and you can easily manage and adjust settings to meet your needs. Because Elastic Beanstalk leverages services like Amazon EC2 and Amazon S3, you can run your application on the same highly durable and highly available infrastructure.

Elastic Beanstalk automatically scales your application up and down based on default Auto Scaling settings. You can easily adjust Auto Scaling settings based on your specific application's needs.

You have your choice of three separate development languages and tools when you use Elastic Beanstalk:

To get started with Elastic Beanstalk, visit the AWS Elastic Beanstalk Developer Guide.

I should mention that we are looking for highly motivated software developers and product managers who want to work on leading edge, highly scaled services such as AWS CloudFormation and AWS Elastic Beanstalk. If you are interested, send us your resume to aws-cloudformation-jobs at amazon.com or aws-elasticbeanstalk-jobs at amazon.com. Come and help us make history!

-- Jeff;

PS - I recently interviewed Saad Ladki of the Elastic Beanstalk team on The AWS Report. You can watch the video to learn more about Elastic Beanstalk and how our customers are putting it to use.

by AWS Evangelist at May 17, 2012 02:31 AM

May 16, 2012

OakLeaf Systems

Windows Azure and Cloud Computing Posts for 5/15/2012+

A compendium of Windows Azure, Service Bus, EAI & EDI, Access Control, Connect, SQL Azure Database, and other cloud-computing articles.

Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:


Azure Blob, Drive, Table, Queue and Hadoop Services

No significant articles today.

SQL Azure Database, Federations and Reporting

Clint Edmonson (@clinted) answered Why can’t I connect to my SQL Azure database? in a 5/15/2012 post to the US DPE Azure Connection blog:

We’ve been running our Azure Hands On Experience labs throughout the US Central Region for the last month or so, and the SQL Azure exercises require the user to connect to a newly created SQL Azure database from their development workstation.

Inevitably, at least one machine is configured with one or more settings that prevent a connection from succeeding, either from SQL Server Management Studio or Visual Studio’s Data Connections tree in Server Explorer (my preferred method btw).

Here are the troubleshooting tips we’ve been using in the labs:

1. Double check your software versions.

If you’re using SQL Server Management Studio, it needs to be SQL Server Management Studio 2008 R2 (the free Express or full edition). If you go to the Help | About menu, the dialog will say R2 in the product logo.

image

If you want to connect directly from inside Visual Studio 2010, make sure you have Visual Studio 2010 Service Pack 1 (SP1) installed as well. If you go to the Help | About menu item in Visual Studio 2010, the popup dialog will say SP1 immediately after the product version number. Also, make sure you have the latest Windows Azure SDK installed. It will set up the hooks for Azure database connectivity from within Visual Studio 2010.

image

2. Double check your SQL Server Native Client Configuration

If you’re getting an error that mentions named pipes, it means the SQL client networking protocol settings on your development machine are set to connect to servers using the named pipes protocol by default. This won’t work with SQL Azure because named pipes is a network protocol optimized for LAN traffic. All connections to SQL Azure are made over TCP/IP, so you need TCP/IP enabled.

To do this, run SQL Server Configuration Manager: go to your Start menu and select All Programs | SQL Server 2008 R2 | Configuration Tools | SQL Server Configuration Manager.

image

You need to make sure the TCP/IP protocol is enabled under the SQL Native Client 10.0 Configuration tree item. (On 64-bit machines you might have both 32-bit and 64-bit tree branches; be sure to enable TCP/IP in both.) It’s generally a good idea to give TCP/IP a higher priority than named pipes, as TCP/IP is the most common network protocol for SQL Servers these days. This MSDN article has more details about the configuration options.

3. Check your client side firewall rules.

SQL Servers communicate over TCP/IP through port 1433. The Windows Firewall software that shipped with XP SP2, Vista and Windows 7 is pretty good about asking you when a new port is about to be used and letting you choose whether you want it opened. If you're using another client-side firewall solution, be sure to enable port 1433 for outbound connections. If your firewall software is configured on an application-by-application basis, set up a rule for SQL Server Management Studio or Visual Studio as appropriate.
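
A quick way to tell whether outbound port 1433 is blocked is to attempt a plain TCP connection. The following is a minimal Python sketch rather than part of the lab material; the function name `can_reach` and the server name in the usage comment are placeholders.

```python
import socket


def can_reach(host: str, port: int = 1433, timeout: float = 3.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


# Example (placeholder server name, substitute your own):
#   can_reach("yourserver.database.windows.net")
```

A False result only says that the TCP connection failed somewhere along the path; it does not tell you which firewall is responsible.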

4. Check the SQL Azure server firewall rules

When you created your SQL Azure database, you were asked to set up firewall rules to allow outside connections to the server. When you add a rule, the dialog box lets you set a range of IP addresses. It also shows your current IP address. In most cases your current IP address is chosen from a bank of addresses on the network you’re connected to.

To check these settings, log into your Windows Azure account, navigate to the databases area, and choose the database you’re trying to connect to from the tree on the left side of the screen. You should see the dropdown option to view and edit the Firewall Rules for your database server. Add or edit a rule to enable access for the development machine you’re trying to connect with.

image

As a minimum, you should add your current IP address as it is shown at the bottom. The hands-on labs recommend adding a large enough address range so that if you have to reboot and end up acquiring a slightly different address, you’ll still be able to connect without having to add another rule. You can see that I did this here.

Note: The address shown for Your current IP address is the public address that the server sees your internet traffic coming from. In the big chain of firewalls between you and your SQL Azure database, this is the address as it will be recognized from the network you currently reside in. It may or may not match the actual IP address your individual machine says it’s using. That’s OK. What it’s seeing is the most public internet facing firewall that you’re going through to get to the database. That’s all the server needs to know to allow you in.
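
To see whether a given public address would be covered by a rule's start and end addresses, the comparison can be sketched in Python as below. The helper name `in_firewall_range` is hypothetical and the addresses used in the comment come from the reserved documentation ranges, not from a real rule.

```python
import ipaddress


def in_firewall_range(address: str, start: str, end: str) -> bool:
    """Return True if address lies within the inclusive range start..end."""
    addr = ipaddress.ip_address(address)
    return ipaddress.ip_address(start) <= addr <= ipaddress.ip_address(end)


# Example: a rule spanning 203.0.113.0 through 203.0.113.255 covers
# 203.0.113.57 but not 198.51.100.4.
```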

5. Check your corporate firewall rules

If you still can’t connect, all of the above items check out, and you’re currently connected to a corporate network, chances are your corporate IT folks have another firewall in place to keep the baddies out and to keep you from accidentally letting them in.

If you’ve got an alternative method to connect to the internet such as a wireless hotspot or a guest network, try connecting to it and see if your attempt to reach your SQL Azure server succeeds. If so, then you know there’s a corporate firewall blocking you.

You’ll have to check with your network admins to see if they allow outbound TCP/IP connections on port 1433. Chances are you’ll have to do some paperwork to get that set up.

These are the troubleshooting steps we’ve found fruitful during labs. If you’re still stuck beyond this point, get a colleague to try connecting to your server from their machine. If they succeed, then you know there’s a configuration difference. It’s just a matter of finding the difference.


MarketPlace DataMarket, Social Analytics, Big Data and OData

My (@rogerjenn) updated Accessing US Air Carrier Flight Delay DataSets on Windows Azure Marketplace DataMarket and “DataHub” added an Exporting Data to Excel PowerPivot Tables and Charts section on 5/16/2012:

Exporting Data to Excel PowerPivot Tables and Charts

My two-year-old Enabling and Using the OData Protocol with SQL Azure of 3/26/2011 explained “how to enable the OData protocol for specific SQL Azure instances and databases, query OData sources, and display formatted Atom 1.0 data from the tables in Internet Explorer 8 and Excel 2010 PowerPivot tables.” The post also provided a “comparison of PowerPivot for Excel and Tableau (see the end of the “Working with OData Feeds in PowerPivot for Excel 2010” section.)”

My later Using the Microsoft Codename “Social Analytics” API with Excel PowerPivot and Visual Studio 2010 of 11/2/2011 explained how to open an empty PowerPivot worksheet and connect to a dataset provided by the Windows Azure Marketplace DataMarket.

This example uses the Explorer’s Export to Excel 2010’s PowerPivot feature to create a worksheet and chart of average departure delays by air carrier for a particular month. This procedure assumes that you have a MarketPlace account and a free subscription to the US Air Carrier Flight Delays dataset.

1. Download and install the x86 or x64 version of Excel PowerPivot from the download page to match the bitness of your Office 2010 installation.

2. Open the US Air Carrier Flight Delays dataset, sign into the DataMarket with an account that has a subscription to the dataset, click the Explore This Dataset link, specify 2 as the Month and 2012 as the Year for optional parameters, and click the Export button to open the Export pane:

image

Note: Specifying a single month and year limits the spreadsheet to about 500,000 rows.

3. Accept the Excel PowerPivot option, click the lower Download button, and click Open when asked if you want to open or save the ServiceQuery.atomsvc file from datamarket.azure.com to open Excel’s Table Import Wizard dialog. Replace the default Friendly Connection Name with US Air Carrier Flight Delays for this example, copy your Account Key from Notepad and paste it into the Account Key text box, and mark the Save My Account Key check box:

image

Note: If you don’t have the Account Key copy in Notepad, click the Find button to open the Account Keys page, select your Account Key, and then copy and paste it to the text box.

4. Click Next to open the Wizard’s Select Tables and Views dialog:

image

Note: If you receive the following message:

image

the account key you entered probably is for a non-administrator user. Log in with an administrative account, and repeat the process in the preceding note.

5. Click the Preview & Filter button to display the Preview Selected Table dialog. Clear the DayofMonth, Month, RowId and Year check boxes to omit the columns from the PowerPivot worksheet:

image

6. Click OK to close the dialog and click Finish to begin downloading data. The Status column displays the number of rows downloaded:

image

7. After all rows download, click Close to dismiss the Wizard and view the data in a worksheet:

image

8. Open the PivotTable gallery and select Pivot table to open a new sheet with a PowerPivot Field List pane. Mark the Carrier and DepDelayMinutes check boxes to add Row Labels for Carriers and default Sum of DepDelayMinutes values:

image

9. Click the arrow in the Values text box’s Sum of DepDelay… item, select Edit Measure to open the Measure Settings dialog and select Average as the aggregation function:

image

10. Click OK to close the dialog and display the average values:

image

11. Click the PivotTable button again, select Pivot Chart and specify this worksheet:

image

12. Click the Field List to hide the pane, change column C’s heading to Average, change the cell format of Column C to Number, expand the chart by dragging the corners, edit the title as shown below, and select and delete the Total legend:

image
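
The aggregation that steps 8 through 10 configure in the PivotTable, an average of DepDelayMinutes grouped by Carrier, can be sketched outside Excel as follows. This is only an illustration in Python: the sample rows are made up, and only the two column names come from the dataset.

```python
from collections import defaultdict


def average_delay_by_carrier(rows):
    """Group rows by Carrier and average their DepDelayMinutes values."""
    totals = defaultdict(lambda: [0.0, 0])  # carrier -> [sum, count]
    for row in rows:
        entry = totals[row["Carrier"]]
        entry[0] += row["DepDelayMinutes"]
        entry[1] += 1
    return {carrier: total / count
            for carrier, (total, count) in sorted(totals.items())}


# Made-up sample rows, not real flight data.
sample_rows = [
    {"Carrier": "AA", "DepDelayMinutes": 10},
    {"Carrier": "AA", "DepDelayMinutes": 20},
    {"Carrier": "DL", "DepDelayMinutes": 0},
    {"Carrier": "DL", "DepDelayMinutes": 6},
]
print(average_delay_by_carrier(sample_rows))  # {'AA': 15.0, 'DL': 3.0}
```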


Miguel Parejo wrote ISV Guest Post Series: Softlibrary and Kern4Cloud on Windows Azure on 5/15/2012:

Editor’s Note: Today’s post, written by Miguel Parejo, CTO at Softlibrary, describes how the company uses Windows Azure and the Windows Azure Marketplace to run and sell its multi-tenant information management service.

Softlibrary is a company founded in 1988 in Barcelona, Spain. Since then, it has always been involved in information management and in providing cutting-edge custom solutions to our customers. For that purpose, the company adopted Microsoft platforms and architectures from its very beginning.

Kern4Cloud is a multi-tenant service focused on information management, whether or not the information is corporate in nature. It can handle the entire information lifecycle, providing a set of tools for publication, categorization and classification, lexical-semantic and thesaurus systems, version control, multilingual support, live translation and workflow processes.

We chose Windows Azure because it resides in certified data centers where information and services are kept in a reliable and secure way. Windows Azure resources are capable of being stretched in order to provide high-performance solutions, as well.

Every single portion of existing information within the system is saved in XML format, so we can also look at Kern4cloud as a black-box that transforms heterogeneous sources into standard and internationalized ones.

Let’s see how we can accomplish a typical flow with Kern4Cloud. Your company likely has a Privacy Policy statement and it will probably change over time. Once you have imported the first version you can create new versions, duplicate existing versions, and even translate versions on the fly using the main translation engines available as part of the solution. The system can also convert your files to pure XML so you can later edit them with its own editor, called X.Edit, a WYSIWYM editor. All this can be accomplished with the web component called K4C.Workplace. Your company is likely structured in a way that some departments must give their consent before publishing. Given this, you can first create a workflow that forces those departments, and only those departments, involved in the publishing process to read and revise the statement, check that it conforms to the law, correct translations and finally give their consent so the document can be published and made ready for consumption.

The Challenge

When we first looked at Windows Azure, we realized that the architecture and design stages should now include some cost-efficient strategies. There are some billing drivers you must consider when migrating solutions to Windows Azure or creating them from scratch. Fortunately, Microsoft has provided some extra features and capabilities to make this process much easier. To mention a couple:

There are other tools, and you can also take a look at some cost-efficient strategies out there. So that was the main challenge: design, migrate, adapt and write code in a way developers had never done before. Every single stakeholder in the project must now take into consideration one new parameter: cost. We don’t mean Windows Azure is expensive; just the other way around.

Once we had redesigned the core components we faced another challenge. How could we authenticate users in a multi-tenant service? Windows Azure comes with Access Control Service (ACS). ACS lets us deal with identities in a transparent way and focus on the authorization process.

The Architecture

Now that we know what Kern4Cloud does, we’ll show which components do it, and later we’ll map them to Windows Azure components.

Here’s a list of the main components:

  • K4C.Workplace: The core of the UI. Users can version, edit, publish, delete, batch-operate, sort, search and filter data easily in a single window. Everything is organized in a grid to increase visibility and, hence, usability. Refer to the first picture to get an idea.
  • K4C.Admin: The place where administrators can manage all the properties, X.Edit styles and mapping, group of users, user permissions, workflows, and so on.
  • Repositories: There are three repositories. One for binary documents (Office, images, videos, etc.), one for the XML files (the ones that contain extracted information and meta-data) and finally all the information handled by K4C.Admin and indexed data used for fast-searching, which is stored in two databases.
  • Workflows: This component deals with all the workflow processes defined by users.

This is an eagle-eye view of the architecture. Let’s see how it’s mapped with Windows Azure components.

Before getting into specifics there’s one thing that must be explained. Kern4Cloud modeling service is offered for two main audiences.

  • Individual users: They have limited disk quota space, some features disabled and share storage resources.
  • Business users: When a company subscribes to this model, it is automatically provisioned with a set of private resources. All users inside the company will access the common repository through K4C.Workplace. However, K4C.Admin can help administrators isolate information within the organization.

Below, there are four figures showing the most important Windows Azure components and services in our architecture. There are some interactions between them which have been omitted for readability.

  • Figure 1: Shows the Web Roles K4C.Workplace and K4C.Admin. Both are deployed on the same hosted service. They are the entry point of the system and the only ones that have some UI. They retrieve up-to-date information from SQL Azure because our indexing engine and K4C.Workplace component assure it will be there as the user interacts and modifies data. Since Full-Text indexing support is not currently available in SQL Azure, our indexing engine has a little bit of extra work. Recently, we’ve started looking into Hadoop because we believe it could be a good alternative.
  • Figure 2: At the moment, all these roles are deployed on the same hosted service (but a different one from those in Figure 1).
    • Workflows and Index: These are Worker Roles. The first one pops messages from the Workflows Queues and processes them. The indexing worker role is a service that indexes and keeps all the information in a coherent state. It also detects document changes and pushes messages into the Workflows Queues. Both are multithreaded in order to minimize bottlenecks. However, the resources of a single instance are limited, so scaling is needed. At the moment, this component is scaled manually, but we are preparing a new version that includes the Windows Azure Autoscaling Application Block (WASABi), a component of the Enterprise Library 5.0 Integration Pack for Windows Azure. We plan to autoscale this and other K4C components based on CPU and memory usage and network load.
    • K4C.FileProperties: This is a web service that processes binary files to convert them into XML. Following the Privacy Policy sample, imagine they consist of Word files in your environment with some styles used for headings, footers, etc., which can be mapped with K4C.Admin into XML tags. Once you have saved a new version, if mapping is properly set, the indexing role will send the file to this service and the result will be an XML which can be used for further editing. That is the way you can show your information in many platforms and devices.
    • K4C.Backoffice: This is the middle tier between a website and the K4C system. If you want to show your Privacy Policy statement on your website, you’ll have to request it from this service, which is shared with other customers. It relies on Access Control Service to assure data isolation, which is why you’ll first need to authenticate to ACS before making any request.
      Each role in our system has its own scaling parameters. Some should scale based on CPU usage only, others on network usage, and so on. We deploy a role into a specific hosted service if its scaling parameters are similar to the ones previously deployed (though this is not the only rule we follow). K4C.Backoffice is intended to be consumed by third-party users in the future, so we presume its scaling parameters will differ quite a lot. Hence, we are planning to deploy a new version of this component to its own hosted service.
  • Figure 3: The SQL Azure server holds data for every single customer. Individual users share two databases while businesses have their own pair. All sensitive information is properly encrypted in our data layer.
  • Figure 4: We have two storage accounts: one for customers and the other for deployment, diagnostics and backup. XML data is stored in Tables using a partition key for each customer identity, so data will be served from different servers as explained here. We’ve overridden the TableServiceContext class and intercepted the WritingEntity and ReadingEntity events, so we encrypt and compress XML data before writing to tables and reverse the process after reading from tables.
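
As a rough illustration of the Figure 4 storage scheme, the sketch below keys each entity by customer and compresses the XML before it is stored. It is written in Python rather than with the .NET TableServiceContext the authors describe; the function names and the Payload property are hypothetical, and the encryption step is deliberately left out.

```python
import zlib


def pack_xml(xml_text: str) -> bytes:
    """Compress XML text before it is written to the table."""
    return zlib.compress(xml_text.encode("utf-8"))


def unpack_xml(blob: bytes) -> str:
    """Reverse pack_xml after the entity has been read back."""
    return zlib.decompress(blob).decode("utf-8")


def make_entity(customer_id: str, document_id: str, xml_text: str) -> dict:
    """Build a table entity whose partition key is the customer identity."""
    return {
        "PartitionKey": customer_id,    # one partition per customer
        "RowKey": document_id,          # unique within the partition
        "Payload": pack_xml(xml_text),  # hypothetical property name
    }
```

Doing the packing and unpacking in one place, as the authors do with the WritingEntity and ReadingEntity events, keeps the rest of the code working with plain XML.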

Windows Azure Marketplace Integration

Finally, we decided to empower Kern4Cloud by filing an application to be published on the Windows Azure Marketplace. We found it to be a perfect destination for a cloud-based solution because it’s a platform where our customers can securely subscribe to our services, and it makes billing a worry of the past. When customers want to subscribe to our service, they only need to visit the Windows Azure Marketplace and search for Kern4Cloud. After they choose the most suitable offering, the Windows Azure Marketplace will ask them to log in with a Windows Live ID account and provide some billing information (such as a credit card number). In the Windows Azure Marketplace, subscriptions are billed monthly, so the first month will be charged immediately. Customers will trust the billing process because it’s provided by Microsoft, and ISVs only have to follow some rules and provide their offering prices to the Windows Azure Marketplace. The whole process is fast, secure and reliable for customers and providers.

Furthermore, integrating a solution with the Marketplace is quite easy. Basically, you should follow the steps shown in the corresponding sample in the Windows Azure Training Kit (WATK). Microsoft has done a great job collecting good samples and best practices for building Windows Azure-based solutions. You can find the Windows Azure Marketplace integration project separately, but I recommend downloading the whole kit.

Let’s see how this works with a sample:

  1. A customer finds your solution in the Windows Azure Marketplace and decides to subscribe. Once the billing and subscription information has been provided, the Marketplace redirects the customer to the AzureMarketplaceOAuthHandler.ashx handler on your website, indicating that a new customer has subscribed.
  2. Your website must confirm the request is truly coming from the Windows Azure Marketplace. This task is handled by a project called AzureMarketplace.OAuthUtility which you’ll find in the WATK. You can either attach the project or reference the DLL. It contains the handler mentioned above, so you should also be adding the following line to your web.config:

<handlers>
<add name="AzureMarketplaceOAuthHandler" verb="*"
path="AzureMarketplaceOAuthHandler.ashx"
type="Microsoft.AzureMarkeplace.OAuthUtility.AuthorizationResponseHandler, Microsoft AzureMarketplace.OAuthUtility"/>

</handlers>

This project relies on Microsoft.IdentityModel and Microsoft.IdentityModel.Protocols.OAuth libraries, as well.

  3. When the origin of the request is confirmed to be the Windows Azure Marketplace, your solution is told that a customer is subscribing and, hence, you can ask for additional information if needed. To do this you must add five classes to your website. One of them is used to read the information that defines your website as a Windows Azure Marketplace client. This information should be stored in your web.config like this:

<azureMarketplaceConfiguration
appSpecificAzureMarketplaceOAuthClientId="YOUR_CLIENT_ID"
appSpecificAzureMarketplaceOAuthClientSecret = "CLIENT_SECRET_KEY"
appSpecificPostConsentRedirectUrl =
"http://127.0.0.1:81/AzureMarketplaceOAuthHandler.ashx"
appSpecificWellKnownPostConsentUseUri =
"http://127.0.0.1:81/Subscription/New"/>

This information must be the same as the one defined in the Marketplace (whether it is the playground or not).

Of course, a section element must be added, as well:

<section name="azureMarketplaceConfiguration"
type="YOURNAMESPACE.AzureMarketplaceConfiguration, YOURASSEMBLY"
requirePermission="false"/>

Note that the testing URLs should be replaced in a production environment.

The other important class is SubscriptionUtils.cs. This will be the final destination of the Marketplace requests. Here you will create the CreateSubscription and Unsubscribe methods in order to process all the requests. You only need to add this line to your Global.asax (Application_Start method):

AzureMarketplaceProvider.ConfigureOAuth(new SubscriptionUtils());

And you’re done. Your website is now capable of receiving subscription requests only from the Marketplace and processing them according to your needs.

In a Nutshell

The migration from our on-premises system has not been an easy task, but a thrilling one. Windows Azure provides all the services we needed to accomplish our mission. We only had to take a few considerations into account beforehand. One of them concerns the ongoing discussion on the web about whether the Cloud means the death of IT. Our experience is that IT departments won’t disappear; they’ll just have to get used to these platforms. Furthermore, we believe the debate is going to finish soon with the introduction of hybrid clouds to the mainstream.

We’d like to thank Microsoft for the opportunity of being the first Spanish company to write for this blog series. If you would like more information regarding some of the aspects shown here, feel free to contact us.


Dhananjay Kumar (@debug_mode) described Kendo UI ListView Control and OData in Windows Phone Application in a 5/15/2012 post:

In this post we will see how to work with the Kendo UI Mobile ListView control and OData. Before you go ahead with this post, I recommend that you read Creating First Windows Phone Application using Kendo UI mobile and PhoneGap or Cordova.

Using ListView

We can use a ListView control as follows. You need to explicitly set the data-role of the ul HTML element to listview.

image

The resulting ListView is rendered as follows in the Windows Phone emulator:

image

We can create a grouped ListView as follows. We need to specify data-type as group.

image

The resulting ListView is rendered as follows in the Windows Phone emulator:

image

If we want to make ListView items links, we can do that as follows.

image

The resulting ListView is rendered as follows in the Windows Phone emulator:

image

Working with OData

image

We are going to fetch movie details from the Netflix OData feed, which is available at

http://odata.netflix.com/Catalog/Titles

First, we need to create a datasource from the OData feed. The datasource can be created as follows:

image

While creating [the] datasource, we specify the URL of the OData feed, the type, and the page size to be fetched from the Netflix server. After the datasource has been created, we need to set the template and datasource of the ListView as follows.

$("#lst").kendoMobileListView(
{
template: "<strong>${data.Name}<br/><a href= ${data.Url}></a></strong><br/><img src=${data.BoxArt.MediumUrl} alt=a />",
dataSource: data
});

In the above code snippet we set the datasource and the template. A template can contain HTML elements. Any field value can be referenced in the template as ${data.fieldname}, as with ${data.Name} above.
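
The page size set on the datasource corresponds to OData's standard paging options, $top and $skip. The Python sketch below shows how a page request for the feed could be formed; the helper name `odata_page_url` is hypothetical, and the exact query string that the Kendo UI odata transport sends is not shown in the post, so treat this only as an illustration of the protocol.

```python
from urllib.parse import urlencode


def odata_page_url(feed_url: str, page: int, page_size: int = 10) -> str:
    """Build a URL that requests one page of an OData feed."""
    query = {
        "$top": page_size,               # number of entries to return
        "$skip": (page - 1) * page_size, # entries to skip before this page
    }
    # safe="$" keeps the leading dollar sign of the option names unescaped.
    return feed_url + "?" + urlencode(query, safe="$")


print(odata_page_url("http://odata.netflix.com/Catalog/Titles", 1))
# http://odata.netflix.com/Catalog/Titles?$top=10&$skip=0
```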

The complete code follows. The view contains two ListViews: the first fetches data from the OData feed and the second has hard-coded data.

<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width, height=device-height, initial-scale=1.0, maximum-scale=1.0, user-scalable=no;" />
<meta http-equiv="Content-type" content="text/html; charset=utf-8"/>

<title>Cordova WP7</title>
<!-- <link rel="stylesheet" href="master.css" type="text/css" media="screen" title="no title" charset="utf-8"/>-->
<link rel="stylesheet" href="styles/kendo.mobile.all.min.css" type="text/css"/>
<script type="text/javascript" charset="utf-8" src="cordova-1.7.0.js"></script>
<script type="text/javascript" src="js/jquery.min.js"></script>
<script type="text/javascript" src="js/kendo.mobile.min.js"></script>
<script type="text/javascript">

$(document).ready(function () {

var data = new kendo.data.DataSource({
type:"odata", // specifies data protocol
pageSize:10,  // limits result set
transport: {
read: "http://odata.netflix.com/Catalog/Titles"
}
});
$("#lst").kendoMobileListView(
{
template: "<strong>${data.Name}<br/><a href= ${data.Url}></a></strong><br/><img src=${data.BoxArt.MediumUrl} alt=a />",
dataSource: data
});
});
</script>
</head>
<body>
<div id="firstview" data-role="view" data-transition="slide">
<div data-role="header">First View Header</div>
Hello World First View <br />
<ul id="lst" data-role="listview" > </ul>
<ul data-role="listview" data-type="group">
<li> Bloggers
<ul>
<li><a href="#debugmode">debugmode</a></li>
<li><a href="#sqlauthrity">SqlAuthority</a></li>
</ul>
</li>
<li> MVP
<ul>
<li><a href="https://twitter.com">Jacob</a></li>
<li><a href="https://twitter.com">Mahesh</a></li>
</ul>
</li>
</ul>
<div data-role="footer">First View Footer</div>
</div>
<script type="text/javascript">
var app = new kendo.mobile.Application();
</script>
</body>
</html>

In the Windows Phone emulator we should get the following output:

image

In this way we can work with Kendo UI ListView control. I hope this post is useful.


David Linthicum (@DavidLinthicum) asserted “CIOs will do well, traditional vendors will fare poorly, and users will find a mixed bag” in a deck for his 3 winners, 3 losers in the move to big data article of 5/15/2012 for InfoWorld’s Cloud Computing blog:

The move to big data is afoot. Recently, Yahoo and Google both tossed their very big hats into the ring, and the cloud computing leaders are already offering access to big data services. It's becoming the killer application for cloud computing, and I believe it will drive a tremendous amount of growth in 2012 and 2013.

However, with any shift in technology, there are those who win and those who lose. Here are three of each for your consideration.

Winners in the move to big data

1. CIOs. This group can finally get its enterprise data under control on the cheap and make sense of all the information it's been trying to manage. Although many CIOs have been the target of corporate budget cuts, they finally can put a big mark in the win column.

2. Users. Ever dial into a call center that has no idea who you are or what you mean to the company? Big data lets companies understand their customers at a level once unheard of, including demographics, social circles, and dealings with other businesses. Customers should benefit.

3. Cloud providers. What do you do with a public IaaS cloud? Big data is a good start, with value that's easy to define.

Losers in the move to big data

1. Big database vendors. They'll suffer as the market turns to the big data names, which are more open (Hadoop-y, I call it). Proprietary database software will lose some of its attraction, and the movement to big data and the cloud services that deliver database systems for big data will cut right into the proprietary vendors' bottom lines.

2. Data warehouse/BI specialists. They did not see this coming. They've been busy working with traditional analytics technology to manage data sets for business analysis and decision support, including million-dollar hardware and software systems. The movement to big data and cloud computing commoditizes many of their concepts -- and makes these older approaches appear wasteful.

3. Users. The use of big data lets huge amounts of personal information be culled, combined, and analyzed. Thus, everything from who you dated in high school to your buying patterns in college to the number of miles you put on your car each year could be much easier to obtain. Much of this will be through analysis of patterns of massive amounts of available data that you had no idea would provide this kind of visibility into your personal information. Bye-bye, privacy.


Windows Azure Service Bus, Access Control, Identity and Workflow

Haishi Bai (@HaishiBai2010) recommended Configure ADFS to show a login page instead of a dialog in a 5/15/2012 post:

Configure ADFS to show a login page instead of a dialog

When you enable ADFS for a website, by default your users will get a login dialog as shown below.

image

You can configure ADFS to bring up a login page instead. To do this, you need to modify the web.config file under the c:\inetpub\adfs\ls folder (assuming you

image

After saving the file, you’ll see a login page instead when you try to login:

image


Windows Azure VM Role, Virtual Network, Connect, RDP and CDN

No significant articles today.


Live Windows Azure Apps, APIs, Tools and Test Harnesses

Avkash Chauhan (@avkashchauhan) described Windows Azure Management using Burp (Java based GUI tool) and REST API in a 5/16/2012 post:

Burp is a great tool for calling REST APIs directly from a GUI. I have written this blog [post] to explain how to configure Burp to manage Windows Azure subscriptions and services using the REST API.

You can download the tool below:
http://portswigger.net/burp/proxy.html

After you start the tool, the first step is to set up the PFX certificate that you have deployed to the Windows Azure Management Portal.

Open [the] “Options” tab and select your PFX certificate as below:

Now open the “repeater” tab and input the following info:

Host > management.core.windows.net
Check > use SSL
In the request window please enter the REST API and parameters as below:


GET /<Your_Subscription_ID>/services/hostedservices HTTP/1.1
Content-Type: text/xml
x-ms-version: 2010-04-01
Host: management.core.windows.net (at the end of this line, press Enter twice)
Once you have entered the REST API and parameters, the request window changes to show a new “Headers” tab. When you are ready, select “go” to submit the request:
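For readers who would rather script the call than click through Burp, the same request can be sketched in Python. This is only an illustrative sketch and not part of the original post: the function names are hypothetical, and it assumes the management certificate has been exported from PFX to a single PEM file holding both the certificate and its private key, because Python's standard ssl module does not read PFX files directly.

```python
import http.client
import ssl


def build_list_hosted_services_request(subscription_id, api_version="2010-04-01"):
    """Return (method, path, headers) for the hosted-services listing request.

    The path and headers mirror the request typed into Burp's repeater tab.
    http.client adds the Host header itself, so it is not set here.
    """
    path = "/{0}/services/hostedservices".format(subscription_id)
    headers = {
        "Content-Type": "text/xml",
        "x-ms-version": api_version,
    }
    return "GET", path, headers


def list_hosted_services(subscription_id, pem_cert_path,
                         host="management.core.windows.net"):
    """Send the request with a client certificate; return (status, body).

    pem_cert_path is an assumed PEM export of the management certificate.
    """
    context = ssl.create_default_context()
    context.load_cert_chain(pem_cert_path)  # client certificate for authentication
    conn = http.client.HTTPSConnection(host, context=context, timeout=30)
    try:
        method, path, headers = build_list_hosted_services_request(subscription_id)
        conn.request(method, path, headers=headers)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()
```

Calling `list_hosted_services("<Your_Subscription_ID>", "management.pem")` would be the scripted equivalent of pressing “go” in Burp; the PEM file name here is a placeholder.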

You will see the response as below:



Opstera (@Opstera) reported Opstera Eases Transition to the Cloud With New Dashboard for Monitoring Windows Azure in a 5/16/2012 press release (via Marketwire):

Today, Opstera -- a cloud-based health management and capacity optimization provider -- announced a new online dashboard called CloudGraphs, designed to monitor quality of service (QoS) for cloud platforms such as Windows Azure. The dashboard offers Windows Azure customers additional visibility and new metrics related to Microsoft Corp.'s cloud platform in real time, providing a new level of control and assurance. Over time, CloudGraphs will expand to provide data on the availability and performance of additional cloud services. The new dashboard is an important component of Opstera's vision of providing customers with an end-to-end view of their entire cloud operations, which includes health management of the cloud platform, the application itself, and the related third-party services.

"As customers make the transition from the datacenter to the cloud, they are concerned about losing control over the operations of their critical business systems," said Opstera CEO Paddy Srinivasan. "CloudGraphs reveals new performance metrics for Windows Azure services, providing customers with vital insights into the overall health of their cloud applications."

Opstera's CloudGraphs compiles ongoing data on the performance of Windows Azure services by conducting workload tests against the service on a continual basis, then compares the time required to complete each test. Using a series of algorithms, CloudGraphs calculates latency data for the most popular Windows Azure services: Cloud Services, SQL Azure, and Storage. The dashboard synthesizes the data and displays the information in buttons reflecting available, degraded, or interrupted Windows Azure service. Customers can click the buttons to review historical data for a particular test and learn more information.

"Windows Azure provides on-demand compute, storage, networking, and content delivery, and also automates system management tasks," said Brian Goldfarb, Director of Windows Azure Product Marketing at Microsoft. "Opstera's CloudGraphs gives our mutual customers a deeper view into their mission critical cloud operations. We are excited about future possibilities of this service and the prospect of input from the Windows Azure community."

The dashboard will be submitted to the vibrant Windows Azure community of developers and programmers. These Windows Azure experts can offer valuable feedback to enhance customers' experiences and improve performance data. Community members will tweak the dashboard's existing algorithms -- or suggest new ones -- and provide recommendations for the types and numbers of latency tests in an effort to continually upgrade the tool.

CloudGraphs complements Opstera's existing suite of products and tools designed to provide customers with deeper insights into every aspect of their cloud operations. Based on customer demand for better quality of service in the cloud, CloudGraphs goes beyond application health management by allowing customers to proactively monitor the quality of service of the underlying Windows Azure service. This is a critical part of Opstera's commitment to improve overall cloud operations management for customers.

Opstera's CloudGraphs is a cloud service and is available free of charge. For more information, visit www.CloudGraphs.com or www.opstera.com.

About Opstera

Opstera is the only cloud-based operations management provider for Windows Azure and offers the industry's most comprehensive view of health management and capacity optimization. Opstera provides customers with end-to-end visibility into application performance in the cloud by monitoring the underlying Windows Azure services, the application itself, and the dependent third-party cloud services. By providing a comprehensive set of tools and dashboards, Opstera monitors more than 100 million Windows Azure metrics per month and has helped more than 100 customers gain deeper insights into their cloud operations. For more information, go to www.opstera.com, visit Facebook, or follow Opstera on Twitter at @opstera.

My (@rogerjenn) Links to My Cloud Computing Articles at Red Gate Software’s ACloudyPlace Blog post was updated on 5/15/2012:

I’m a regular contributor of articles about cloud computing/big data development and strategy to Red Gate Software’s (@redgate) ACloudyPlace (@ACloudyPlace) blog.

• Updated 5/15/2012 for the latest article.

The following table lists the topics I’ve covered to date:

Date Title
5/15/2012 Free Private Data from Silos for Internal Use with Microsoft CodeName “Data Hub”
4/3/2012 Analyze Years of Air Carrier Flight Arrival Delays in Minutes with the Windows Azure HPC Scheduler
2/22/2012 Track Consumer Engagement and Sentiment with Microsoft Codename “Social Analytics”
1/18/2012 Upload Big Data to SQL Azure Federated Databases with BCP Automatically
12/16/2011 Use OData to Execute RESTful CRUD Operations on Big Data in the Cloud


Red Gate Software offers the following “cloud-ready” tools:

The firm acquired Cerebrata Software Private Ltd in October 2011. The press release for the acquisition is here.

Full disclosure: I have gratis licenses for most Cerebrata products, including Cloud Storage Studio and Diagnostics Manager. I use these tools regularly.


<Return to section navigation list>

Visual Studio LightSwitch and Entity Framework 4.1+

The Entity Framework Team announced EF5 Release Candidate Available on NuGet on 5/15/2012:

A couple of months ago we released EF5 Beta 2. Since releasing Beta 2 we have made a number of changes to the code base, so we decided to publish a Release Candidate before we make the RTM available.

This release is licensed for use in production applications. Because it is a pre-release version of EF5, there are some limitations; see the license for more details.

What Changed Since Beta 2?

This Release Candidate includes the following changes:

  • Added a CommandTimeout property to DbMigrationsConfiguration to allow you to override the timeout for applying migrations to the database.
  • We updated Code First to add tables to an existing database if the target database doesn’t contain any tables from the model. Previously, Code First would assume that the database contained the correct schema if it was pointed at an existing database that was not created using Code First. Now it checks to see if the database contains any of the tables from the model. If it does, Code First will continue to try and use the existing schema. If not, Code First will add the tables for the model to the database. This is useful in scenarios where a web hoster gives you a pre-created database to use, or when adding a Code First model to a database created by the ASP.NET Membership Provider, etc.
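The new Code First check described in the list above boils down to a small decision rule. The following Python sketch models only that rule as stated in the announcement; it is not Entity Framework code, the function name and return values are hypothetical, and comparing table names as exact strings is a simplifying assumption.

```python
def code_first_action(model_tables, database_tables):
    """Model the decision Code First makes for an existing database.

    model_tables: table names the Code First model would create.
    database_tables: table names already present in the target database.
    Returns a label for the behavior described in the release notes.
    """
    # Exact string matching is an assumption made for illustration only.
    overlap = set(model_tables) & set(database_tables)
    if overlap:
        # At least one model table already exists: keep using that schema.
        return "use_existing_schema"
    # No model tables found: add the model's tables to the database.
    return "add_model_tables"
```

For example, a database that holds only ASP.NET Membership tables shares no names with a Blogs/Posts model, so the sketch returns "add_model_tables", which matches the membership-database scenario mentioned above.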

This release also includes fixes for the following bugs found in Beta 2:

  • Exception in partial trust applications ‘Request for ConfigurationPermission failed while attempting to access configuration section 'entityFramework'.’
  • Migrations: Using a login that has a default schema other than ‘dbo’ for the user causes runtime failures
  • Migrations: DateTime format issue on non-en cultures
  • Migrations: Migrate.exe does not set error code after a failure
  • Migrations: Error renaming entity in many:many relationship
  • Migrations: Checking for Seed override fails in partial trust
  • Migrations: Better error message when startup project doesn't reference assembly with migrations
  • Migrations: ModuleToProcess deprecated in PowerShell 3 causes warning when installing EF NuGet package

What’s New in EF5?

EF 5 includes bug fixes to the 4.3.1 release and a number of new features. Most of the new features are only available in applications targeting .NET 4.5; see the Compatibility section for more details.

  • Enum support allows you to have enum properties in your entity classes. This new feature is available for Model, Database and Code First.
  • Table-Valued functions in your database can now be used with Database First.
  • Spatial data types can now be exposed in your model using the DbGeography and DbGeometry types. Spatial data is supported in Model, Database and Code First.
  • The performance enhancements that we recently blogged about are included in EF 5.
  • Visual Studio 11 includes the LocalDb database server rather than SQLEXPRESS. During installation, the EntityFramework NuGet package checks which database server is available. The NuGet package will then update the configuration file by setting the default database server that Code First uses when creating a connection by convention. If SQLEXPRESS is running, it will be used. If SQLEXPRESS is not available then LocalDb will be registered as the default instead. No changes are made to the configuration file if it already contains a setting for the default connection factory.
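The installer behavior in the last bullet above can be read as a three-way decision. The sketch below restates it in Python purely for illustration; it is not the NuGet package's actual install script, and the function name, parameters, and return values are hypothetical.

```python
def choose_default_connection_factory(sqlexpress_running, config_has_default_factory):
    """Return the server to register as the default, or None to leave the
    configuration file untouched.

    sqlexpress_running: True if a SQLEXPRESS instance is running.
    config_has_default_factory: True if the configuration file already
        contains a default connection factory setting.
    """
    if config_has_default_factory:
        # An existing setting is never overwritten.
        return None
    if sqlexpress_running:
        return "SQLEXPRESS"
    # SQLEXPRESS is not available, so LocalDb becomes the default.
    return "LocalDb"
```

The existing-setting check comes first, because the announcement says no changes are made when the configuration file already names a default connection factory.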

The following new features are also available in the Entity Model Designer in Visual Studio 11 Beta:

  • Multiple diagrams per model allow you to have several diagrams that visualize subsections of your overall model.
  • Shapes on the design surface can now have coloring applied.
  • Batch import of stored procedures allows multiple stored procedures to be added to the model during model creation.

Getting Started

You can get EF 5 Release Candidate by installing the latest pre-release version of the EntityFramework NuGet package.

PM> Install-Package EntityFramework -Pre

These existing walkthroughs provide a good introduction to using the Code First, Model First & Database First workflows available in Entity Framework:

We have created walkthroughs for the new features in EF 5:

Compatibility

This version of the NuGet package is fully compatible with Visual Studio 2010 and Visual Studio 11 Beta and can be used for applications targeting .NET 4.0 and 4.5.

Some features are only available when writing an application that targets .NET 4.5. This includes enum support, spatial data types, table-valued functions and the performance improvements. If you are targeting .NET 4.0 you still get all the bug fixes and other minor improvements.

Support

We are seeing a lot of great Entity Framework questions (and answers) from the community on Stack Overflow. As a result, our team is going to continue spending more time reading and answering questions posted on Stack Overflow.

We would encourage you to post questions on Stack Overflow using the entity-framework tag. We will also continue to monitor the Entity Framework forum.


Beth Massi (@bethmassi) reported "What’s New with LightSwitch in Visual Studio 11" Recording Available on 5/14/2012:

On Friday last week I delivered a webcast on the new LightSwitch features in Visual Studio 11 beta. I think it went pretty well considering it was the first time I had done the session end-to-end :-). You can now view the on-demand webcast here:

Download: What’s New with LightSwitch in Visual Studio 11

(Click the “register” button, log in, and then select “download”.)

In the session we built an application called “Media Mate” that connects to the Netflix OData source to keep track of favorite movies and music. I showed off new features around all our OData work like how to consume external OData services. I also demonstrated how LightSwitch creates its own OData services adhering to your business rules and security settings so that other clients on different platforms can access your middle-tier easily and securely. I showed off some of the new UI enhancements and business types as well as some of the new deployment enhancements. I attached the presentation slides to the bottom of this post.


One demo hiccup happened where I messed up the search ability over the Netflix data. Doh! What I ended up doing is unchecking the IsSearchable property on the Title entity in the data designer which subsequently disabled the searching on the screen. So it was user error, not a beta bug ;-). At any rate, I think folks got the point. By the way, there are a lot of things you can do to improve search query performance in LightSwitch and I’ll follow up with a post about that soon.


I also demonstrated the updated Contoso Construction sample that you can download here:

Contoso Construction - LightSwitch Advanced Sample (Visual Studio 11 Beta)

For more information on the new features I demonstrated please see:

Also don’t forget to visit the LightSwitch Developer Center, your one-stop-shop for learning all about LightSwitch.


<Return to section navigation list>

Windows Azure Infrastructure and DevOps

No significant articles today.


<Return to section navigation list>

Windows Azure Platform Appliance (WAPA), Hyper-V and Private/Hybrid Clouds

DevX.com’s TechNet Spotlight answered Why Get a Microsoft Private Cloud? on 5/15/2012:

A Microsoft private cloud puts your applications first. It offers you deep application insight, a comprehensive cross-platform approach, best-in-class performance, and the power to run, migrate, or extend your applications out to the public cloud whenever you need.

All About the App

Applications are the lifeblood of your business. The ability to deploy new applications faster and keep them up and running more reliably is the central mission of IT as a competitive differentiator. To gain a real edge, you need to go beyond just managing infrastructure. You need to manage your applications in a new way.

Get Started Now! Download Microsoft Private Cloud Software today.

The Microsoft Private Cloud lets you deliver applications as a service. You can deploy both new and legacy applications on a self-service basis, and manage them across private cloud and public cloud environments. And with a new way to see what’s happening inside the performance of your applications, you can remediate issues faster—before they become show-stoppers. The Microsoft Private Cloud is all about the app. That means better SLA’s, better customer satisfaction, and a new level of agility across the board.

Cross Platform from the Metal Up

No datacenter is an island. Odds are, you run and manage an IT environment today that is deeply heterogeneous, with a wide range of OS, hypervisor, and development tools in the mix. You want to gain the advantages of private cloud computing, but not if it means walking away from your existing IT investments or adding new layers of complexity.

The Microsoft Private Cloud lets you keep what you’ve got and make the move now to a new kind of agility. That’s because it’s architected from the raw metal up to enable process automation and configuration across platforms and environments. You can manage multiple hypervisors, including VMware, Citrix, and Microsoft offerings. You can run and monitor all major operating systems. And you can develop new applications using multiple toolsets. Because the Microsoft Private Cloud provides comprehensive management of heterogeneous IT environments, you can put your business’s needs ahead of the needs of any particular technology or vendor.

Foundation for the Future

Cloud computing offers the promise of agility, economics, and focus that can unlock new innovation and transform the role of IT in driving business success. The game is no longer just about virtualization and server consolidation. A private cloud delivers fundamentally new capabilities that represent a generational paradigm shift in computing. The bet you make today will have long-term implications for the future of your business.

Only Microsoft delivers a complete private cloud solution that provides a true cloud computing platform. For more than 15 years, we have operated some of the world’s biggest and most advanced datacenters and driven the evolution of major Internet services such as Windows Live, Hotmail, and Bing. We’ve taken all that we’ve learned from this unparalleled experience and put it into the DNA of our private cloud solutions. Go beyond virtualization—and unnecessary per-VM licensing—and proceed with confidence in building a secure and manageable private cloud that delivers best-in-class performance for Microsoft workloads (including Exchange, SQL Server and SharePoint), deep management integration, and compelling economics.

Cloud on Your Terms

The move to cloud computing involves more than just building a private cloud. The undeniable benefits of public cloud computing – on-demand scalability, flexibility, and economics, to name a few—also promise significant competitive advantages. The challenge is to leverage your existing investments, infrastructure, and skill sets to build the right mix of private and public cloud solutions for your business—one that will work for you today and in the future.

With Microsoft, you have the freedom to choose. Our solutions are built to give you the power to construct and manage clouds across multiple datacenters, infrastructures, and service providers—on terms that you control. Because Microsoft solutions share a common set of management, identity, virtualization, and development technologies, you can distribute IT across physical, virtual, and cloud computing models. That means you can keep a handle on compliance, security, and costs. And you can let your business needs drive your IT strategy, instead of having IT limit your options.

Microsoft is the sponsor of TechNet Spotlight.


<Return to section navigation list>

Cloud Security and Governance

No significant articles today.


<Return to section navigation list>

Cloud Computing Events

My (@rogerjenn) “Meet Windows Azure” Event Scheduled for 6/7/2012 in San Francisco post of 5/15/2012 appears as follows:

The Windows Azure Team announced on 5/15/2012 “An innovative online and in-person event introducing cloud-based development technologies from Windows Azure” on June 7, 2012 at 1 PM PDT. Click here to open the landing billboard:


Register here to attend online and for a chance for a ticket to attend in person:


Hope to See You There …


Rick Garibay (@rickggaribay) announced PCSUG.org June Meeting Featuring Glenn Block on Node.js on 5/16/2012:

You asked for it, and here it is.

As a user group that is exclusively focused on distributed computing, the appeal of Node.js to a bunch of messaging guys is obvious. Node.js is new, fascinating and seductive.

After much excitement and buzz around Node.js in general and on Windows Azure in particular, PCSUG.org is thrilled to kick off the summer with Glenn Block, Senior Program Manager on the Node.js SDK for Windows Azure team.

Joining us from Redmond, Glenn will be sharing the latest on what the Node.js SDK for Windows Azure has to offer, whether you are new to Node.js development or a seasoned veteran.

In addition, our own Developer Evangelist, J. Michael Palermo will help kick things off with a discussion on what the back-end developer needs to know about HTML5 and Windows 8.

Below are the agenda and logistics. We hope you'll join us for a great opportunity to learn about what Node.js on Windows Azure has to offer, as well as to network and connect with other members of the Phoenix developer community.

6:00 to 8:00 pm: "Unlock your Inner Node.js in the Cloud with Windows Azure" with Glenn Block, Senior Program Manager, Microsoft Corporation

If I told you that you can build node.js applications in Windows Azure would you believe me? Come to this session and I’ll show you how. You’ll see how to take those existing node apps and easily deploy them to Windows Azure from any platform. You’ll see how you can make your node apps more robust by leveraging Azure services like storage and service bus, all of which are available in our new “azure” npm module. You’ll also see how to take advantage of cool tools like socket.io for WebSockets, node-inspector for debugging and Cloud9 for an awesome online development experience.

About Glenn Block

Glenn is a PM at Microsoft working on support for node.js in Windows and Azure. Glenn has a breadth of experience both inside and outside Microsoft developing software solutions for ISVs and the enterprise. Glenn has been a passionate supporter of open source and has been active in involving folks from the community in the development of software at Microsoft. This has included shipping products under open source licenses, as well as assisting other teams looking to do so. Glenn is also a lover of community and a frequent speaker at local and international events and user groups.

5:30 to 6:00 pm: “What the back-end developer needs to know about HTML5 and Windows 8” with J. Michael Palermo, Senior Developer Evangelist, Microsoft Corporation

Understanding how client applications use and access data is essential for structuring and providing data through services. And just as back-end solutions evolve, so does the client. In this session, you will see how a Windows 8 Metro Style application written in JavaScript and HTML5 is developed to access data via REST APIs and web sockets.

About J. Michael Palermo:

J. “Michael” Palermo IV is a Developer Evangelist with Microsoft. In his years prior to joining Microsoft, Michael served as a Microsoft Regional Director and MVP. Michael has authored several technical books and has published online courses with Pluralsight on HTML5 technologies. Michael continues to share his passion for software development by speaking at developer events around the world.

Registration is required. Please visit pcsug.org for more information and to register for this event: http://pcsug.org/Home/Events


K. Scott Morrison (@KScottMorrison) announced APIs, Cloud and Identity Tour 2012: Three Cities, Two Talks, Two Panels and a Catalyst on 5/15/2012:

On May 15-16 2012, I will be at the Privacy Identity Innovation (pii2012) conference held at the Bell Harbour International Conference Center in Seattle. I will be participating on a panel moderated by Eve Maler from Forrester, titled Privacy, Zero Trust and the API Economy. It will take place at 2:55pm on Tuesday, May 15th:

The Facebook Connect model is real, it’s powerful, and now it’s everywhere. Large volumes of accurate information about individuals can now flow easily through user-authorized API calls. Zero Trust requires initial perfect distrust between disparate networked systems, but are we encouraging users to add back too much trust, too readily? What are the ways this new model can be used for “good” and “evil”, and how can we mitigate the risks?

On Thursday May 17 at 9am Pacific Time, I will be delivering a webinar on API identity technologies, once again with Eve Maler from Forrester. We are going to talk about the idea of zero trust with APIs, an important stance to adopt as we approach what Eve often calls the coming identity singularity–that is, the time when identity technologies and standards will finally line up with real and immediate need in the industry. Here is the abstract for this webinar:

Identity, Access & Privacy in the New Hybrid Enterprise

Making sense of OAuth, OpenID Connect and UMA

In the new hybrid enterprise, organizations need to manage business functions that flow across their domain boundaries in all directions: partners accessing internal applications; employees using mobile devices; internal developers mashing up Cloud services; internal business owners working with third-party app developers. Integration increasingly happens via APIs and native apps, not browsers. Zero Trust is the new starting point for security and access control and it demands Internet scale and technical simplicity – requirements the go-to Web services solutions of the past decade, like SAML and WS-Trust, struggle to solve. This webinar from Layer 7 Technologies, featuring special guest Eve Maler of Forrester Research, Inc., will:

  • Discuss emerging trends for access control inside the enterprise
  • Provide a blueprint for understanding adoption considerations

You Will Learn

  • Why access control is evolving to support mobile, Cloud and API-based interactions
  • How the new standards (OAuth, OpenID Connect and UMA) compare to technologies like SAML
  • How to implement OAuth and OpenID Connect, based on case study examples
  • Futures around UMA and enterprise-scale API access

You can sign up for this talk at the Layer 7 Technologies web site.

Next week I’m off to Dublin to participate in the TMForum Management World 2012. I wrote earlier about the defense catalyst Layer 7 is participating in that explores the problem of how to manage clouds in the face of developing physical threats. If you are at the show, you must drop by the Forumville section on the show floor and have a look. The project results are very encouraging.

I’m also doing both a presentation and participating on a panel. The presentation title is API Management: What Defense and Service Providers Need to Know. Here is the abstract:

APIs promise to revolutionize the integration of mobile devices, on-premise computing and the cloud. They are the secret sauce that allows developers to bring any systems together quickly and efficiently. Within a few years, every service provider will need a dedicated API group responsible for management, promotion, and even monetization of this important new channel to market. And in the defense arena, where agile integration is an absolute necessity, APIs cannot be overlooked.

In this talk, you will learn:

  • Why APIs are revolutionizing Internet communications
      - And making it more secure
  • Why this is an important opportunity for you
  • How you can successfully manage an API program
  • Why developer outreach matters
  • What tools and technologies you must put in place

This talk takes place at the Dublin Conference Centre on Wed May 23 at 11:30am GMT.

Finally, I’m also on a panel organized by my friend Nava Levy from Cvidya. This panel is titled Cloud adoption – resolving the trust vs. uptake paradox: Understanding and addressing customers’ security and data portability concerns to drive uptake.

Here is the panel abstract:

As cloud services continue to grow 5 times faster than traditional IT, it seems that concerns about security and data portability are also on the rise. In this session we will explain the roots of this paradox and the opportunities that arise by resolving these trust issues. By examining the different approaches other cloud providers utilize to address these issues, we will see how service providers, by properly understanding and addressing these concerns, can use trust concerns as a competitive advantage against many cloud providers who don’t have carrier-grade trust as one of their core competencies. We will see that by addressing fraud, security, data portability and governance risks head on, not only will the uptake of cloud services rise to include mainstream customers and conservative verticals, but the type of data and processes that migrate to the cloud will also become more critical to the customers.

The panel is on Thursday, May 24 at 9:50am GMT.

A bit short notice on pii2012, no?


Paul Miller (@PaulMiller) reported CloudCamp reaches Leeds on 14 June on 5/16/2012:

The global CloudCamp movement continues to grow, with events over the next few weeks in Denmark, Germany, Nigeria, Ghana, Kenya, and across the United States. And now, I’m very pleased to announce that the English city of Leeds is joining the party.

Eventbrite - CloudCamp North

CloudCamp events have been taking place in the UK for years, and the London gatherings have picked up real momentum. Outside London, we’ve seen a few events in Warrington, Newcastle, and Edinburgh. We believe that the time is now right for something more regular; a place in which the cloud-building, cloud-using, cloud-interested and cloud-exploring can come together for talk, beer, pizza and more… without having to jump on a train to the deep south.

CloudCamps are interesting events, with a real emphasis on informality. I’ve attended several around the world, and am always impressed by the energy in the room, and by the welcome extended to newcomers. As the main CloudCamp site describes,

“CloudCamp is an unconference where early adopters of Cloud Computing technologies exchange ideas. With the rapid change occurring in the industry, we need a place we can meet to share our experiences, challenges and solutions. At CloudCamp, you are encouraged to share your thoughts in several open discussions, as we strive for the advancement of Cloud Computing. End users, IT professionals and vendors are all encouraged to participate.”

Jeremy Jarvis at Brightbox and Karyn Fleeting and Joel Turner at Tinderbox Media have been driving this event forward, and they’ve invited me on board to help out. I also get to be MC on the night.

We’ve got speakers and sponsors committed, with more of both to come. If you think you should be one of those doing the speaking or the sponsoring, do let us know.

So, if you like talking Cloud, if your boss has ordered you to learn Cloud, or if you’re just keen to understand a little more about what this Cloud thing can do for you, stick the evening of 14 June in your diary, sign up (for free) on Eventbrite, and come along to the Hilton DoubleTree in Leeds for an evening of fun, learning, beer, and more.

See you there!

Image of the County Arcade in Leeds by Flickr user Francisco Perez


Tim Huckaby (@TimHuckaby) and Michael Collier (@MichaelCollier) produced Bytes by MSDN May 15: Michael Collier on 5/15/2012:

Join Tim Huckaby, Founder of InterKnowlogy and Actus Interactive Software, and Michael Collier, National Architect for Neudesic, as they discuss Windows Azure and Windows Phone. Michael raves about helping customers with their cloud projects and shares his insights into the marriage of cloud and mobility in a cost-effective way, as well as his favorite feature, Web Roles.

Another great bytes interview! Get Free Cloud Access: Windows Azure MSDN Benefits | 90 Day Azure Trial

Video Downloads
WMV (Zip) | WMV | iPod | MP4 | 3GP | Zune | PSP

Audio Downloads
AAC | WMA | MP3 | MP4


<Return to section navigation list>

Other Cloud Computing Platforms and Services

Alex Williams (@alexwilliams) reported SAP Announces Free HANA Developer Images on Amazon Web Services in a 5/16/2012 post to the SiliconANGLE blog:

Over the past week, SAP and Amazon Web Services announced a number of alliances. Earlier this week, SAP said they would sell Afaria, its mobile management software, on the AWS Marketplace. Last Friday, SAP and AWS announced that SAP customers can now deploy their SAP solutions on Amazon EC2 instances in production and non-production environments for both Linux and Windows environments.

Now comes the news that developers may now set up a HANA database on Amazon Web Services at no cost. This is huge. HANA is the in-memory database that SAP is positioning to compete directly with Oracle and SAP.

It’s a major coup for the SAP Mentor community, and people like Vijay Vijayasankar, John Applebee, Jon Reed and Dennis Howlett, who have been pushing SAP for the past two years to take this kind of step.

It’s noteworthy, too, as SAP is now without question the most developer friendly enterprise technology company in the market. [Neither] IBM nor Oracle nor HP can match right now the level of commitment that SAP is making to developers.

Developers who sign up are able to run their own pre-configured HANA instance to create their data analysis application. The signup page includes detailed instructions for the setup.

SAP is making three different sizes available for developers. As noted on the SAP community network site, you can use the AWS pricing calculator, which is pre-configured for 4 hours of daily usage on the smallest available size.


Derrick Harris (@derrickharris) reported Calvin: A fast, cheap database that isn’t a database at all in a 5/16/2012 post to GigaOm’s Structure blog:

A team of Yale researchers think they have developed the cure for Oracle and IBM dominance in the world of database performance, and it isn’t even technically a database. In a blog post Wednesday morning written by team members Daniel Abadi and Alexander Thomson (and in a related research paper), the two researchers detail Calvin, a “transaction scheduling and replication coordination service” that they think can level the playing field between high-cost distributed relational databases and less-expensive, but limited, NoSQL and NewSQL databases.

The researchers aren’t dismissing either NoSQL or NewSQL, but rather attempting to address the type of use case on which the popular TPC-C database performance benchmark is based. That benchmark, which simulates an online retail application, requires ACID compliance (which NoSQL options can’t meet) and the ability to update records across database shards in the same transaction, something the authors claim NewSQL databases can’t do.

Why not just stick with Oracle Database and IBM DB2? Cost, especially at scale. As Abadi and Thomson point out in the blog, an Oracle system capable of handling 500,000 transactions per second costs $30 million in hardware and software expenditures.

So, what is Calvin? In a nutshell, it’s software that sits above a scale-out storage system and turns it into a transaction-processing system by capturing, scheduling and executing transactions. Here’s how Abadi and Thomson describe it in the blog post, although the paper goes into much more detail.

Calvin requires all transactions to be executed fully server-side and sacrifices the freedom to non-deterministically abort or reorder transactions on-the-fly during execution. In return, Calvin gets scalability, ACID-compliance, and extremely low-overhead multi-shard transactions over a shared-nothing architecture. In other words, Calvin is designed to handle high-volume OLTP throughput on sharded databases on cheap, commodity hardware stored locally or in the cloud. … Calvin allows user transaction code to access the data layer freely, using any data access language or interface supported by the underlying storage engine (so long as Calvin can observe which records user transactions access).
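As a rough illustration of why fixing the transaction order before execution matters, here is a toy C# sketch. It is not Calvin's code and says nothing about its real interfaces: the `Transfer` type, the `Replay` method and the account figures are all hypothetical. It only shows that replicas replaying the same agreed log end in the same state, while a different order can produce a different result.

```csharp
using System;
using System.Collections.Generic;

public static class DeterministicReplaySketch
{
    // Hypothetical transaction: move Amount from one record to another.
    public sealed class Transfer
    {
        public readonly string Source;
        public readonly string Target;
        public readonly int Amount;

        public Transfer(string source, string target, int amount)
        {
            Source = source;
            Target = target;
            Amount = amount;
        }
    }

    // Executes an already-ordered log against a private copy of the initial state.
    // The outcome depends only on the initial state and the order of the log.
    public static Dictionary<string, int> Replay(
        IDictionary<string, int> initial, IEnumerable<Transfer> orderedLog)
    {
        var state = new Dictionary<string, int>(initial);
        foreach (var t in orderedLog)
        {
            // Deterministic rule: a transfer that cannot be covered is skipped.
            if (state[t.Source] < t.Amount)
                continue;

            state[t.Source] -= t.Amount;
            state[t.Target] += t.Amount;
        }
        return state;
    }

    public static void Main()
    {
        var initial = new Dictionary<string, int> { { "A", 100 }, { "B", 0 } };
        var t1 = new Transfer("A", "B", 80);
        var t2 = new Transfer("A", "B", 50);

        // Both replicas replay the same agreed order, so they must agree.
        var agreedLog = new[] { t1, t2 };
        var replica1 = Replay(initial, agreedLog);
        var replica2 = Replay(initial, agreedLog);
        Console.WriteLine("replica1: A={0} B={1}", replica1["A"], replica1["B"]);
        Console.WriteLine("replica2: A={0} B={1}", replica2["A"], replica2["B"]);

        // A replica that saw the transactions in another order would diverge.
        var reordered = Replay(initial, new[] { t2, t1 });
        Console.WriteLine("reordered: A={0} B={1}", reordered["A"], reordered["B"]);
    }
}
```

With the agreed order the second transfer is skipped for lack of funds; with the reversed order it is the first one that is skipped. That is why the order has to be settled before any replica executes anything.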

Calvin, the researchers claim, can match Oracle’s 500,000 transaction-per-second performance running on commodity servers on Amazon EC2. The cost of the resources to run their benchmark was only $300. (Although, obviously, that doesn’t account for the cost of running the system continuously for years, potentially. Commodity physical hardware might be a better bet in the long term.)

Ultimately, the researchers conclude, for transactions that can execute entirely on the server side, Calvin could be the foundation for an end to the current OLTP regime. The world certainly is hungry for something that can do what Oracle and IBM can do, but that costs what NoSQL databases cost (i.e., nothing, often). And Abadi has some distributed database street cred (the HadoopDB project he led is the foundation of Hadapt’s Hadoop-and-data-warehouse hybrid), so, especially if it’s open sourced, one can’t dismiss Calvin out of hand.

Feature image courtesy of Shutterstock user Semisatch.



Jeff Barr (@jeffbarr) announced Domain Verification for the Amazon Simple Email Service on 5/15/2012:

The Amazon Simple Email Service (SES) makes it easy and cost-effective for you to send bulk or transactional email messages.

As I described in my introductory post (Introducing the Amazon Simple Email Service), you must verify the email address (or addresses) that you plan to use to send messages. The initial verification process must be repeated for each email address.

Today we are introducing a new SES feature. You can now verify an entire domain, and then send email from any address in that domain. In addition to saving you time and effort, this new feature now allows you to use Amazon SES in situations where you don't accept email at the From address, or when you don't know the From address ahead of time.

You verify a domain by creating a TXT record in the domain's DNS record using information that we provide you as part of the domain verification process. Most (not all) DNS providers allow you to create TXT records.

If you are using Amazon Route 53 to provide DNS service for your domain, the process is very straightforward; you can verify the domain using the AWS Management Console. Here's a tour...

The first step is to visit the SES tab of the console and add your domain to the Domains tab in the Verified Senders section:

If you are not using Route 53, the next step is to update your domain's DNS settings using the TXT record information displayed in the console:

If you are using Route 53, push the Use Route 53 button and select the domains and subdomains that you want to verify:

Either way (Route 53 or your own DNS provider), Amazon SES will verify your domain within 72 hours. Once the domain has been verified, you'll receive an email and the domain will be marked as "verified" in the console.

You can now send email from any address in the domain!

If you would like to learn more about domain verification, please sign up for the June 12th webinar: Using Domain Verification with Amazon Simple Email Service. The webinar is free but space is limited!

We have also changed the limit on the number of verified addresses and domains allowed per AWS account from 100 to 1000.


Martin Tantow (@mtantow) reported GoDaddy Introduces Cloud Servers, Competes with AWS and Rackspace in a 5/14/2012 post to the CloudTimes blog:

GoDaddy recently announced its newest offering, Cloud Servers, and competes for the first time with companies like Amazon Web Services and Rackspace. (See an exclusive interview with GoDaddy CEO Warren Adelman below.)

Apart from a very competitive price point, GoDaddy’s Cloud Solution is designed to give its customers a fast plug & play installation, and combines convenient control panels, a strong infrastructure and frequently used features, like firewalls and load balancers.


As the Internet expands and websites become more resource-intensive, many businesses are moving from traditional hosting to more advanced, scalable servers with flexible network options. Go Daddy Cloud Servers are designed for companies looking to take complete control of their Web hosting environment.

In an exclusive interview with CloudTimes (see below), Warren Adelman, CEO of GoDaddy, said that “Cloud Servers are a natural progression – offering shared hosting with ‘cloudy’ attributes like elasticity, offering virtual servers and dedicated servers. Our new product is the next piece in the hosting landscape – offering Infrastructure-as-a-Service with Cloud Servers that include load balancing, network configuration and easy spinning up and down of environments.”

In an initial test of the service, we have been quite impressed with its intuitive UI and quick setup.

Dashboard Screenshot


Machine Configuration


CEO Series Interview

Watch the exclusive interview with GoDaddy CEO Warren Adelman about its new cloud computing offering. Warren was interviewed by Martin Tantow, Founder of CloudTimes, as part of the CloudTimes CEO Series.


<Return to section navigation list>

by Roger Jennings (--rj) (noreply@blogger.com) at May 16, 2012 05:52 PM

ReadWriteCloud


Loomio: Making Better Decisions Remotely Possible

Email, instant messaging, forums, code forges and other collaboration tools make it possible for distributed teams to get work done - but they're not great tools for making decisions. The team behind Loomio wants to solve that with a new Web-based tool for focused, concise discussions that allow all team members to be heard. 

If you've ever worked with a distributed team, you know how difficult it can be to make decisions as a group. Discussions are unstructured, rambling affairs with dozens of messages flying about and no good way to track consensus. Even worse, requests for feedback can go without comment entirely, or with only a few stakeholders raising a voice.

Agree, Disagree, Abstain, Block

A discussion in Loomio starts with a topic and a specific proposal, and members have the option of voting on the proposal. A group can define the options (defaults are yes/no, abstain and block), and each member can summarize their view. As votes are tallied, everyone can see a chart that shows how many folks are in agreement, how many aren't, how many have abstained, etc.

This sounds pretty simple, but most of today's collaboration tools don't provide a good way to focus a discussion. The key to Loomio is that it provides a central tool for discussions and (if used properly) narrows things down to decisions that are easy to vote on. Central is key here. It helps a lot to confine activity to one tool rather than making users look all over for information.

A lot of online teams communicate in several ways, including email, IM, IRC, over the phone and face to face. Stakeholders who prefer one medium (like email) lose out if discussions are held in IRC, or vice-versa. Even worse, stakeholders may be totally unaware a decision is being made at all. If a group settles on Loomio, it would enable the group to say "decisions are made here and nowhere else." If something isn't put up in Loomio (or another approved tool), then a decision wouldn't be legitimate.

Settling on a decision tool like Loomio should also help cut down on noise in other communication channels. It's popular to have discussions in email and CC everyone who might have an opinion or might need to vote on something. An active team can inspire email fatigue pretty quickly with discussions that are neverending. Loomio would allow users to visit, vote and get back to work.

Actually, Loomio isn't only for distributed teams. There's no reason it couldn't be used in any organization, but it's especially appropriate for situations where team members or stakeholders are far-flung.

Can Loomio Solve the Problem?

Like any tool, Loomio would only be effective if used properly. The early design could probably do with some modification - a more obvious start and end date for votes, for example - but the initial design is solid. The Loomio team says it's already in use by some organizations. New Zealand companies or organizations like Enspiral and BuckyBox are among the first adopters - though no one seems to be providing a public instance that we can point to.

If you want to help, the group is looking for contributions from Ruby on Rails developers, as well as a little extra cash (NZ $5,000) to help the volunteer team devote more time to Loomio development. The project is sort-of open source and already on GitHub. It's "sort-of" open source because the site says it's open source, but if you look at the license text on GitHub it's basically a stub saying: "We need to add the license. GPLv2?" The pledge drive (through the Pledge Me platform) ends on May 18th. The developers have already raised more than their target, but more money might mean more time spent on development.

If adopted a bit more widely, Loomio might help take distributed teams to a new level - much like GitHub has helped with development. It is a simple concept, but bringing order to decision-making could help teams communicate better and make better decisions, no matter where they happen to be located. 

by Joe Brockmeier at May 16, 2012 01:33 PM

S3 Storage for WordPress Blogs

Looking to tap Amazon S3 storage for your WordPress blog? The WP2Cloud plugin lets you store all your WordPress data - not just media files - in S3.  

The WP2Cloud plugin was developed by OblakSoft as a solution for Yet Another Picture Sharing Site (Yapixx). Yapixx is provided as a preconfigured Amazon Machine Image (AMI) for EC2 that uses WordPress and several extensions to provide an S3-hosted picture-sharing site.

But you don't have to run Yapixx or use Amazon EC2 at all if you prefer to use hosting elsewhere. All you need is the WP2Cloud plugin and the Cloud Storage Engine for MySQL (ClouSE). Note that ClouSE is mandatory. The plugin will error out if you try to install it without ClouSE available. Naturally, you need an AWS account and an S3 bucket to put files in, too. The full instructions are on the OblakSoft site.

Once it's installed, you can decide whether to go full monty or store only a portion of your content on S3. The benefit of using WP2Cloud is that you take a load off your Web server and let S3 serve up some or all of your content. That includes full posts, if you decide to use ClouSE to put MySQL data in S3 as well. As far as I know, WP2Cloud is the only plugin that puts post data in S3 rather than media only.

While Amazon is the only service that's supported right now, the WP2Cloud documentation indicates that support for other services may be on the horizon. It would be excellent if you could tap other cloud providers or open-source cloud stacks as well.

Other Options

The WP2Cloud plugin might not work well for some users. For example, it requires MySQL 5.5.19 or higher, but plenty of sites have older releases of MySQL. And it might be overkill if you only want to store large media, like videos, in S3 and leave the rest on the WordPress host.

The Amazon S3 for WordPress with CloudFront plugin stores files in S3 transparently and offers the option of using CloudFront. CloudFront is a content distribution network (CDN) that can be used to distribute content more quickly and mitigate traffic spikes.

If you're looking to offload video only to S3, you can use the S3 Video Plugin. It does what it says on the tin, though you may need to tweak some PHP parameters to upload large files to S3.

For sites with minimal traffic (like my personal blog), WP2Cloud is not necessary. But if you're trying to scale WordPress for a lot of traffic, particularly bursty traffic, then you should take a look at some of the cloud storage options to see if they'll help you reduce site load times and server load.

by Joe Brockmeier at May 16, 2012 11:00 AM

May 15, 2012

Amazon Web Services


Domain Verification for the Amazon Simple Email Service

The Amazon Simple Email Service (SES) makes it easy and cost-effective for you to send bulk or transactional email messages.

As I described in my introductory post (Introducing the Amazon Simple Email Service), you must verify the email address (or addresses) that you plan to use to send messages. The initial verification process must be repeated for each email address.

Today we are introducing a new SES feature. You can now verify an entire domain, and then send email from any address in that domain. In addition to saving you time and effort, this new feature now allows you to use Amazon SES in situations where you don't accept email at the From address, or when you don't know the From address ahead of time.

 You verify a domain by creating a TXT record in the domain's DNS record using information that we provide you as part of the domain verification process. Most (not all) DNS providers allow you to create TXT records. 

If you are using Amazon Route 53 to provide DNS service for your domain, the process is very straightforward; you can verify the domain using the AWS Management Console. Here's a tour...

The first step is to visit the SES tab of the console and add your domain to the Domains tab in the Verified Senders section:

If you are not using Route 53, the next step is to update your domain's DNS settings using the TXT record information displayed in the console:

If you are using Route 53, push the Use Route 53 button and select the domains and subdomains that you want to verify:

Either way (Route 53 or your own DNS provider), Amazon SES will verify your domain within 72 hours. Once the domain has been verified, you'll receive an email and the domain will be marked as "verified" in the console.

You can now send email from any address in the domain!

If you would like to learn more about domain verification, please sign up for the June 12th webinar: Using Domain Verification with Amazon Simple Email Service. The webinar is free but space is limited!

We have also changed the limit on the number of verified addresses and domains allowed per AWS account from 100 to 1000. 

-- Jeff;

 

by AWS Evangelist at May 15, 2012 11:26 PM

AWS Cloud Storage for the Enterprise

There are a lot of storage options available to AWS users -- Amazon S3, Elastic Block Storage, and the AWS Storage Gateway, along with ancillary services such as Amazon CloudFront for content distribution, AWS Direct Connect for dedicated network connections, and Amazon Elastic MapReduce for large-scale data processing.

Our customers are using these services to implement cloud-based backup, disaster recovery, and archiving for entire enterprises.

In order to help you make sense of all of these options and to give you a better sense of how AWS can help you, we have created a day-long event devoted to cloud storage for the enterprise.

The AWS Cloud Storage for the Enterprise event will take place on June 6th in New York, and will run from 10:00 AM to 5:30 PM. In addition to keynotes from Stephen Schmidt (Vice President, Security Engineering and Chief Information Security Officer for AWS) and Matt Tavis (AWS Solutions Architect), you will get to hear directly from AWS customers. Stephen will focus on the all-important question of data security in the cloud, and Matt will discuss the ways in which the cloud is transforming the storage strategies that our enterprise customers are putting in to play.

They'll also cover some lesser-known topics such as cloud-as-a-tier, NAS-like functionality with the cloud, long-term data archiving with high-performance record retrieval, and high-speed data transfer in and out.

The event is free, but you will need to register.

-- Jeff;

by AWS Evangelist at May 15, 2012 08:01 PM

SearchCloudComputing (Carl Brooks)


What makes a good cloud provider?

Fair costs and SLAs were once enough to discern top cloud providers. In today’s packed cloud market, vendors must bring more to the table.


by Tom Nolle, Contributor (ceo@cimicorp.com) at May 15, 2012 04:46 PM

ReadWriteCloud


The Hottest IPO You've Never Heard Of

With an expected valuation of close to $100 billion, it’s understandable that no one can stop talking about Facebook’s initial public offering this week. But while Facebook basks in the social media spotlight, companies tackling tough business problems are exciting investors, if not consumers. Workday, for example, is expected to be among the largest IPOs this year in the business software market.

Founded in 2005, the Pleasanton, California-based Workday makes payroll, accounting and human resources management software available over the Internet to 280 corporations, including big names like Time Warner, Kleenex-maker Kimberly-Clark and giant electronics manufacturer Flextronics. So far, Workday has raised $250 million from venture capital firms and other investors.

Dave Duffield Strikes Again

The company is the brainchild of David Duffield and Aneel Bhusri. Duffield, cofounder and chief executive of PeopleSoft, was forced to sell the company to Oracle in 2005 for $10 billion in a hostile takeover. Duffield and Bhusri, who was vice chairman of PeopleSoft, decided that same year to start rebuilding their company in the cloud.

As of the end of 2011, Workday had more than $300 million in revenue and an estimated value of $2 billion, AllThingsD reported last week. The company expects to launch its IPO in the fourth quarter with the help of bankers Morgan Stanley, Goldman Sachs, Allen & Company and JPMorgan Chase & Co.

Ironically, while the hype has focused on high-profile consumer and social-media IPOs, business-focused tech companies may be a better bet. On the consumer side, as of Friday, the stock of online game maker Zynga had fallen 25% from its IPO price in December last year. Stock of coupon site Groupon has dropped 50% since November 2011.

But Jive Software, which makes social business tools, has seen its stock climb 50% from its IPO price in December 2011. Stock in Guidewire Software, which serves the insurance industry, is up almost 60% since the company’s debut in January.

Workday is going after a market that is expected to soar. Worldwide revenue from delivering business software over the Internet is expected to reach $240 billion in 2020, a six-fold increase from 2010, according to Forrester Research.

Cloud Rules

Companies of all sizes are looking at cloud-based software because it’s often easier and less expensive than deploying and maintaining on-premise applications. In an interview with Bloomberg TV in April, Bhusri claimed Workday’s HR software is half the cost and is much easier to use. “There’s no reason enterprise software needs a training manual,” he said.

Early investors of Workday included Duffield and venture capital firms New Enterprise Associates and Greylock Partners, where Bhusri is a partner. In its last round of funding in December 2011, Workday raised $100 million from investors that included MSD Capital, owned by Dell founder Michael Dell, and Bezos Expeditions, the personal investment entity of Jeff Bezos, founder and chief executive of online retailer Amazon.

Workday is not without competition. German software maker SAP bought rival SuccessFactors last year for $3.4 billion, and Oracle is expected to either buy or develop its way into the HR cloud. In addition, analysts are wondering how long it will be before Workday partner Salesforce.com begins to add competing capabilities to its cloud-based software for sales reps and customer relationship management.

Despite the competition, Workday has built a solid business that investors and some analysts believe could make it a leading player in the HR software market – even though the general public has probably never heard of the company.

Everyone knows Facebook, of course, but analysts are debating whether a company with slowing revenue growth and a potential valuation of 99 times earnings can possibly live up to its hype.

 

Lead image courtesy of Shutterstock.

by Antone Gonsalves at May 15, 2012 12:00 PM

OakLeaf Systems


Links to My Cloud Computing Articles at Red Gate Software’s ACloudyPlace Blog

I’m a regular contributor of articles about cloud computing/big data development and strategy to Red Gate Software’s (@redgate) ACloudyPlace (@ACloudyPlace) blog.

• Updated 5/15/2012 for the latest article.

The following table lists the topics I’ve covered to date:

Date Title
5/15/2012 Free Private Data from Silos for Internal Use with Microsoft CodeName “Data Hub”
4/3/2012 Analyze Years of Air Carrier Flight Arrival Delays in Minutes with the Windows Azure HPC Scheduler
2/22/2012 Track Consumer Engagement and Sentiment with Microsoft Codename “Social Analytics”
1/18/2012 Upload Big Data to SQL Azure Federated Databases with BCP Automatically
12/16/2011 Use OData to Execute RESTful CRUD Operations on Big Data in the Cloud


Red Gate Software offers the following “cloud-ready” tools:

The firm acquired Cerebrata Software Private Ltd in October 2011. The press release for the acquisition is here.

Full disclosure: I have gratis licenses for most Cerebrata products, including Cloud Storage Studio and Diagnostics Manager. I use these tools regularly.


by Roger Jennings (--rj) (noreply@blogger.com) at May 15, 2012 10:54 AM

Windows Azure and Cloud Computing Posts for 5/14/2012+

A compendium of Windows Azure, Service Bus, EAI & EDI, Access Control, Connect, SQL Azure Database, and other cloud-computing articles.

Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:


Azure Blob, Drive, Table, Queue and Hadoop Services

Manu Cohen-Yashar (@ManuKahn) described Blob Parallel Upload and Download in a 5/14/2012 post:

To get the best performance from Azure blob storage, you need to upload and download data in parallel. For very small files it is OK to use the simple blob API (UploadFile, UploadFromStream etc.), but for large chunks of data parallel upload is required.

To do a parallel upload we'll use a block blob when working with streaming data (such as images or movies) and the producer-consumer design pattern. One thread reads the stream, creates blocks and puts them into a queue. A collection of other threads reads blocks from the queue and uploads them to the cloud. Once all the threads have finished, the whole blob is committed.
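The producer-consumer split described above can be tried without any storage account. The following is a minimal sketch, not the author's implementation: `FakeUpload`, the in-memory `store` and the final reassembly are hypothetical stand-ins for uploading blocks and committing the block list, and `BlockingCollection<T>` is one assumed way to build the queue.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class ProducerConsumerSketch
{
    // Hypothetical stand-in for a per-block upload call; it only records the block.
    private static void FakeUpload(ConcurrentDictionary<int, byte[]> store, int blockId, byte[] data)
    {
        store[blockId] = data;
    }

    // One producer cuts the stream into blocks; several consumers "upload" them.
    public static byte[] Transfer(Stream input, int blockSize, int consumerCount)
    {
        // Bounded queue, so the producer cannot run far ahead of the consumers.
        var queue = new BlockingCollection<KeyValuePair<int, byte[]>>(16);
        var store = new ConcurrentDictionary<int, byte[]>();

        // Producer: the only task that touches the input stream.
        var producer = Task.Factory.StartNew(() =>
        {
            try
            {
                int blockId = 0;
                while (true)
                {
                    var buffer = new byte[blockSize];
                    int filled = 0;
                    int read;
                    while (filled < blockSize &&
                           (read = input.Read(buffer, filled, blockSize - filled)) > 0)
                    {
                        filled += read;
                    }

                    if (filled == 0)
                        break;                      // end of stream
                    if (filled < blockSize)
                        Array.Resize(ref buffer, filled);

                    queue.Add(new KeyValuePair<int, byte[]>(blockId++, buffer));
                }
            }
            finally
            {
                queue.CompleteAdding();             // lets the consumers finish
            }
        });

        // Consumers: take blocks off the queue until the producer is done.
        var consumers = new List<Task>();
        for (int i = 0; i < consumerCount; i++)
        {
            consumers.Add(Task.Factory.StartNew(() =>
            {
                foreach (var block in queue.GetConsumingEnumerable())
                    FakeUpload(store, block.Key, block.Value);
            }));
        }

        producer.Wait();
        Task.WaitAll(consumers.ToArray());

        // "Commit": stitch the blocks together in block-id order.
        using (var output = new MemoryStream())
        {
            for (int id = 0; id < store.Count; id++)
                output.Write(store[id], 0, store[id].Length);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        var data = new byte[1000];
        new Random(7).NextBytes(data);

        byte[] result = Transfer(new MemoryStream(data), 64, 4);
        Console.WriteLine(result.SequenceEqual(data) ? "round trip ok" : "round trip FAILED");
    }
}
```

Keeping every stream read inside the producer means the consumers never share a stream position, so the only shared state is the queue itself.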

Let's see the code:

public class PrallelBlobTransfer
    {
        // Async events and properties
        public event EventHandler TransferCompleted;
        private bool TaskIsRunning = false;
        private readonly object _sync = new object();

        // Used to calculate download speeds
        Queue<long> timeQueue = new Queue<long>(100);
        Queue<long> bytesQueue = new Queue<long>(100);

        public CloudBlobContainer Container { get; set; }
       
        public PrallelBlobTransfer(CloudBlobContainer container)
        {
            Container = container;
        }

       
        public void UploadFileToBlobAsync(string fileToUpload, string blobName)
        {
            
            if (!File.Exists(fileToUpload))
                throw new FileNotFoundException(fileToUpload);
            
            var worker = new Action<Stream,string>(ParallelUploadStream);

            lock (_sync)
            {
                if (TaskIsRunning)
                    throw new InvalidOperationException("The control is currently busy.");

                AsyncOperation async = AsyncOperationManager.CreateOperation(null);
                var fs = File.OpenRead(fileToUpload);
                worker.BeginInvoke(fs,blobName, TaskCompletedCallback, async);

                TaskIsRunning = true;
            }
        }
        
        public void UploadDataToBlobAsync(byte[] dataToUpload, string blobName)
        {
            var worker = new Action<Stream, string>(ParallelUploadStream);

            lock (_sync)
            {
                if (TaskIsRunning)
                    throw new InvalidOperationException("The control is currently busy.");

                AsyncOperation async = AsyncOperationManager.CreateOperation(null);
                var ms = new MemoryStream(dataToUpload);
                worker.BeginInvoke(ms, blobName, TaskCompletedCallback, async);

                TaskIsRunning = true;
            }
        }

        public void DownloadBlobToFileAsync(string filePath, string blobToDownload)
        {
            var worker = new Action<Stream,string>(ParallelDownloadFile);

            lock (_sync)
            {
                if (TaskIsRunning)
                    throw new InvalidOperationException("The control is currently busy.");

                AsyncOperation async = AsyncOperationManager.CreateOperation(null);
                var fs = File.OpenWrite(filePath);
                worker.BeginInvoke(fs, blobToDownload, TaskCompletedCallback, async);

                TaskIsRunning = true;
            }
        }
        
        public void DownloadBlobToBufferAsync(byte[] buffer, string blobToDownload)
        {
            var worker = new Action<Stream, string>(ParallelDownloadFile);

            lock (_sync)
            {
                if (TaskIsRunning)
                    throw new InvalidOperationException("The control is currently busy.");

                AsyncOperation async = AsyncOperationManager.CreateOperation(null);
                var ms = new MemoryStream(buffer);
                worker.BeginInvoke(ms, blobToDownload, TaskCompletedCallback, async);

                TaskIsRunning = true;
            }
        }

        public bool IsBusy
        {
            get { return TaskIsRunning; }
        }
        
        // Blob Upload Code
        // 200 GB max blob size
        // 50,000 max blocks
        // 4 MB max block size
        // Try to get close to 100k block size in order to offer good progress update response.
        private int GetBlockSize(long fileSize)
        {
            const long KB = 1024;
            const long MB = 1024 * KB;
            const long GB = 1024 * MB;
            const long MAXBLOCKS = 50000;
            const long MAXBLOBSIZE = 200 * GB;
            const long MAXBLOCKSIZE = 4 * MB;

            long blocksize = 100 * KB;
            //long blocksize = 4 * MB;
            long blockCount;
            blockCount = ((int)Math.Floor((double)(fileSize / blocksize))) + 1;
            while (blockCount > MAXBLOCKS - 1)
            {
                blocksize += 100 * KB;
                blockCount = ((int)Math.Floor((double)(fileSize / blocksize))) + 1;
            }

            if (blocksize > MAXBLOCKSIZE)
            {
                throw new ArgumentException("Blob too big to upload.");
            }

            return (int)blocksize;
        }

        /// <summary>
        /// Uploads content to a blob using multiple threads.
        /// </summary>
        /// <param name="inputStream"></param>
        /// <param name="blobName"></param>
        private void ParallelUploadStream(Stream inputStream,string blobName)
        {
          // the optimal number of transfer threads
            int numThreads = 10;

            long fileSize = inputStream.Length;

            int maxBlockSize = GetBlockSize(fileSize);
            long bytesUploaded = 0;

            // Prepare a queue of blocks to be uploaded. Each queue item is a key-value pair where
            // the 'key' is block id and 'value' is the block length.
            var queue = new Queue<KeyValuePair<int, int>>();
            var blockList = new List<string>();
            int blockId = 0;
            while (fileSize > 0)
            {
                int blockLength = (int)Math.Min(maxBlockSize, fileSize);
                string blockIdString = Convert.ToBase64String(ASCIIEncoding.ASCII.GetBytes(
                    string.Format("BlockId{0}", blockId.ToString("0000000"))));
                KeyValuePair<int, int> kvp = new KeyValuePair<int, int>(blockId++, blockLength);
                queue.Enqueue(kvp);
                blockList.Add(blockIdString);
                fileSize -= blockLength;
            }

            var blob = Container.GetBlockBlobReference(blobName);
            blob.DeleteIfExists();

            BlobRequestOptions options = new BlobRequestOptions()
            {
                RetryPolicy = RetryPolicies.RetryExponential(
                    RetryPolicies.DefaultClientRetryCount, RetryPolicies.DefaultMaxBackoff),
                Timeout = TimeSpan.FromSeconds(90)
            };

            // Launch threads to upload blocks.
            var tasks = new List<Task>();
            for (int idxThread = 0; idxThread < numThreads; idxThread++)
            {
                tasks.Add(Task.Factory.StartNew(() =>
                {
                    KeyValuePair<int, int> blockIdAndLength;

                    using (inputStream)
                    {
                        while (true)
                        {
                            // Dequeue block details.
                            lock (queue)
                            {
                                if (queue.Count == 0)
                                    break;

                                blockIdAndLength = queue.Dequeue();
                            }

                            byte[] buff = new byte[blockIdAndLength.Value];
                            BinaryReader br = new BinaryReader(inputStream);

                            // move the file system reader to the proper position
                            inputStream.Seek(blockIdAndLength.Key * (long)maxBlockSize, SeekOrigin.Begin);
                            br.Read(buff, 0, blockIdAndLength.Value);

                            // Upload block.
                            string blockName = Convert.ToBase64String(BitConverter.GetBytes(
                                blockIdAndLength.Key));
                            using (MemoryStream ms = new MemoryStream(buff, 0, blockIdAndLength.Value))
                            {
                                string blockIdString = Convert.ToBase64String(
                                    ASCIIEncoding.ASCII.GetBytes(string.Format("BlockId{0}",
                                    blockIdAndLength.Key.ToString("0000000"))));

                                string blockHash = GetMD5HashFromStream(buff);
                                blob.PutBlock(blockIdString, ms, blockHash, options);
                            }
                        }
                    }
                }));
            }

            // Wait for all threads to complete uploading data.
            Task.WaitAll(tasks.ToArray());

            // Commit the blocklist.
            blob.PutBlockList(blockList, options);
        }

        /// <summary>
        /// Downloads content from a blob using multiple threads.
        /// </summary>
        /// <param name="outputStream"></param>
        /// <param name="blobToDownload"></param>
        private void ParallelDownloadFile(Stream outputStream, string blobToDownload)
        {
            int numThreads = 10;

            var blob = Container.GetBlockBlobReference(blobToDownload);
            blob.FetchAttributes();
            long blobLength = blob.Properties.Length;

            int bufferLength = GetBlockSize(blobLength); // 4 * 1024 * 1024;
            long bytesDownloaded = 0;

            // Prepare a queue of chunks to be downloaded. Each queue item is a key-value pair
            // where the 'key' is start offset in the blob and 'value' is the chunk length.
            Queue<KeyValuePair<long, int>> queue = new Queue<KeyValuePair<long, int>>();
            long offset = 0;
            while (blobLength > 0)
            {
                int chunkLength = (int)Math.Min(bufferLength, blobLength);
                queue.Enqueue(new KeyValuePair<long, int>(offset, chunkLength));
                offset += chunkLength;
                blobLength -= chunkLength;
            }

            int exceptionCount = 0;

            using (outputStream)
            {
                // Launch threads to download chunks.
                var tasks = new List<Task>();
                for (int idxThread = 0; idxThread < numThreads; idxThread++)
                {
                    tasks.Add(Task.Factory.StartNew(() =>
                    {
                        KeyValuePair<long, int> blockIdAndLength;

                        // A buffer to fill per read request.
                        byte[] buffer = new byte[bufferLength];

                        while (true)
                        {
                            // Dequeue block details.
                            lock (queue)
                            {
                                if (queue.Count == 0)
                                    break;

                                blockIdAndLength = queue.Dequeue();
                            }

                            try
                            {
                                // Prepare the HttpWebRequest to download data from the chunk.
                                HttpWebRequest blobGetRequest = BlobRequest.Get(blob.Uri, 60, null, null);

                                // Add header to specify the range
                                blobGetRequest.Headers.Add("x-ms-range",
                                    string.Format(System.Globalization.CultureInfo.InvariantCulture,
                                    "bytes={0}-{1}",
                                    blockIdAndLength.Key,
                                    blockIdAndLength.Key + blockIdAndLength.Value - 1));

                                // Sign request.
                                StorageCredentials credentials = blob.ServiceClient.Credentials;
                                credentials.SignRequest(blobGetRequest);

                                // Read chunk.
                                using (HttpWebResponse response = blobGetRequest.GetResponse() as HttpWebResponse)
                                {
                                    using (Stream stream = response.GetResponseStream())
                                    {
                                        int offsetInChunk = 0;
                                        int remaining = blockIdAndLength.Value;
                                        while (remaining > 0)
                                        {
                                            int read = stream.Read(buffer, offsetInChunk, remaining);
                                            lock (outputStream)
                                            {
                                                outputStream.Position = blockIdAndLength.Key + offsetInChunk;
                                                outputStream.Write(buffer, offsetInChunk, read);
                                            }

                                            offsetInChunk += read;
                                            remaining -= read;
                                            Interlocked.Add(ref bytesDownloaded, read);
                                        }
                                    }
                                }
                            }
                            catch (Exception ex)
                            {
                                // Add block back to queue
                                queue.Enqueue(blockIdAndLength);

                                exceptionCount++;
                                // If we have had more than 100 exceptions then break
                                if (exceptionCount == 100)
                                {
                                    throw new Exception("Received 100 exceptions while downloading."
                                        + ex.ToString());
                                }
                                if (exceptionCount >= 100)
                                {
                                    break;
                                }
                            }
                        }
                    }));
                }

                // Wait for all threads to complete downloading data.
                Task.WaitAll(tasks.ToArray());
            }
        }

        private void TaskCompletedCallback(IAsyncResult ar)
        {
            // get the original worker delegate and the AsyncOperation instance
            Action<Stream, string> worker = (Action<Stream, string>)((AsyncResult)ar).AsyncDelegate;
            AsyncOperation async = (AsyncOperation)ar.AsyncState;

            // finish the asynchronous operation
            worker.EndInvoke(ar);

            // clear the running task flag
            lock (_sync)
            {
                TaskIsRunning = false;
            }

            // raise the completed event
            async.PostOperationCompleted(state => OnTaskCompleted((EventArgs)state), new EventArgs());
        }

        protected virtual void OnTaskCompleted(EventArgs e)
        {
            if (TransferCompleted != null)
                TransferCompleted(this, e);
        }

        private string GetMD5HashFromStream(byte[] data)
        {
            MD5 md5 = new MD5CryptoServiceProvider();
            byte[] blockHash = md5.ComputeHash(data);
            return Convert.ToBase64String(blockHash, 0, 16);
        }
    }




SQL Azure Database, Federations and Reporting

Haishi Bai (@HaishiBai2010) described how to Deploy Adventure Works Database to SQL Azure using SSMS 2012 in a 5/13/2012 post:

  1. First, get SSMS 2012. I’m using Microsoft SQL Server 2012 Express, which can be downloaded here.
  2. Then, get the sample database here. Extract the zip file to a folder.
  3. Open a command-line prompt and change to the Adventureworks subfolder of the extracted folder.
  4. Issue the command CreateAdventureWorksForSQLAzure.cmd <servername> <username> <password>.
  5. The database should now have been created on your local or SQL Azure instance.
  6. Go to SSMS, select the master database of your server, and create a new login with the query
    CREATE LOGIN demo WITH PASSWORD = '<your password>'
  7. Run the following query against both the master and AdventureWorks2012 databases to create a user from the login:
    create user demo from login demo;
  8. Finally, make demo a db_owner (or assign whatever other privileges you’d like the user to have):
    EXEC sp_addrolemember 'db_owner', 'demo';
The <servername> <username> <password> values are those for your SQL Azure server instance. SQL Azure exists only in Microsoft data centers and isn’t supported by local instances.
The original version of this walkthrough is AdventureWorks Community Samples Databases for SQL Azure February 14, 2012 on CodePlex.


MarketPlace DataMarket, Social Analytics, Big Data and OData

My (@rogerjenn) Accessing US Air Carrier Flight Delay DataSets on Windows Azure Marketplace DataMarket and “DataHub” post of 5/14/2012 begins:

Contents:

  • Windows Azure Marketplace DataMarket
  • Microsoft Codename “Data Hub”
  • Building an OData URL Query and Displaying Data
  • Coming Soon

The initial five months (October 2011 through February 2012) of US Air Carrier Flight Delays data curated from the U.S. Federal Aviation Administration’s (FAA) On_Time_On_Time_Performance.csv (sic) files are publicly available free of charge in OData and *.csv formats from OakLeaf Systems’ Windows Azure Marketplace DataMarket and Codename “Data Hub” preview sites.

Accessing these datasets, which originate from an On_Line_Performance table of the same SQL Azure server instance, for the first time isn’t an altogether intuitive process, so the following two sections describe how to open the datasets with the DataMarkets’ Data Explorer feature. …

And continues with detailed, illustrated tutorials for registering for and querying the US Air Carrier Flight Delays datasets to return tabular and OData streams.


Chris Woodruff (@cwoodruff) asserted “Microsoft’s newest version of the Open Data Protocol (OData) is something both developers and IT managers should check out” in a deck for his stand-in Why Microsoft’s Open Data Protocol matters article for Mary Jo Foley on 5/14/2012:

With the newest version of the Open Data Protocol (OData), Microsoft is bringing a richer data experience for developers, information workers and data journalists to consume and analyze data from any source publishing with the OData protocol. The goal is not to hide your data and keep it locked away, but to curate the data you provide to your partners, customers and/or the general public. By allowing a curated data experience, you will generate more revenue and allow your data more widespread adoption.

To gain a clearer picture of how this new forum will work, it’s key to understand what the Open Data Protocol is and where it originated. There’s more information about OData at my 31 Days of OData blog series, but the official statement for Open Data Protocol (OData) is that it is a Web protocol for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today. Really what that means is that we can select, save, delete and update data from our applications just like we have been against relational SQL databases for years. The benefit is the ease of setting up the OData feeds to be utilized from libraries which Microsoft has created for us developers.

An additional benefit comes from the fact that OData has a standard that allows a clear understanding of the data due to the metadata from the feed. Behind the scenes, we send OData requests to a web server which has the OData feed through HTTP calls using the protocol for OData.
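To make those HTTP calls a little more concrete, here is a minimal Python sketch (not taken from the article) that assembles an OData query URL from the standard $filter, $select, $top and $format system query options. The service root, the Players entity set and the property names are hypothetical placeholders, not the real baseball feed.

```python
from urllib.parse import quote


def build_odata_url(service_root, entity_set, filter_expr=None,
                    select=None, top=None, fmt=None):
    """Compose an OData query URL from a few system query options."""
    options = []
    if filter_expr:
        # Percent-encode spaces but keep quotes, commas and parentheses readable.
        options.append("$filter=" + quote(filter_expr, safe="()',"))
    if select:
        options.append("$select=" + ",".join(select))
    if top is not None:
        options.append("$top=" + str(int(top)))
    if fmt:
        options.append("$format=" + fmt)
    url = service_root.rstrip("/") + "/" + entity_set
    return url + ("?" + "&".join(options) if options else "")


# Hypothetical feed, entity set and property names, for illustration only.
url = build_odata_url(
    "https://example.invalid/odata",
    "Players",
    filter_expr="lastName eq 'Ruth'",
    select=["firstName", "lastName"],
    top=5,
)
print(url)
```

The resulting URL can be issued with any HTTP client; which response formats are available depends on the service that hosts the feed.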

OData started back in 2007 at the second Microsoft Mix conference. The announcement was an incubation project codenamed Astoria. The purpose of Project Astoria was to find a way to transport data across HTTP in order to architect and develop web based solutions more efficiently. Not until after the project had time to incubate, did the OData team see patterns occurring which led them to see the vision of the Open Data Protocol. The next big milestone was the 2010 Microsoft Mix Conference where OData was officially announced and proclaimed to the world as a new way to handle data. The rest is history.

Recently, a third version of the OData protocol was announced which will allow developers to produce and consume data, not only to their own desktop applications, web sites and mobile apps, but also open their data up for solutions they may never have intended when creating the OData service, better known as a feed. The next version will include a number of new feature additions for both the server side which hosts the OData feeds, as well as the client side which developers will use to consume the data in their architected solutions.

Here are just a few of the new features:

  • Vocabularies that convey more meaning and extra information to enable richer client experiences.
  • Actions that provide a way to inject behaviors into an otherwise data-centric model without confusing the data aspects of the model.
  • OData version 3 supports Geospatial data and comes with 16 new spatial primitives and some corresponding operations.

An example is my own Baseball Statistics OData feed located here and publicly open to anyone to consume the data. The feed contains the entire 138 years of statistics for Major League Baseball including team, player and post-season stats. My baseball statistics OData feed will be updated to OData v3 very soon and will use many of the new features that were recently announced.

There are many libraries to consume and understand OData for developers to use in their solutions. You can find many of the libraries for your mobile, web and CMS solutions at the OData home site here.

What about the business aspects of OData for organizations that have valuable data they wish to share and wish to generate revenue from? By having data that is easy to consume and understand, organizations can allow their customers and partners (via the developers that build the solutions using one or more of the available OData libraries) to leverage the value of curated data that the organization owns. Business customers can either host the data they own and control the consumer experience and subsequent revenue collection, or they can set up their OData feed inside Microsoft’s Windows Azure Marketplace and have Microsoft do the heavy lifting for them, in terms of offering subscriptions to their data and collection of subscription fees.

Think of the Windows Azure DataMarket as an App store for data. It’s a great place to generate that needed revenue without having to create the infrastructure beyond the OData feed which surfaces your proprietary data.

In the end, maintaining valuable data in an organization should not solely consist of utilizing databases which are hidden from those outside corporate walls. The data should be curated and allowed to be consumed and even generate revenue for an organization. If you are a developer looking at either producing a method to get data to your applications, or you wish to consume the rich data you see others using in their applications, dig into OData. You will find that it is a great way to become an expert in Data Experience. Furthermore, if you are a manager who is looking for new ways to get your data to the public either for free or to generate additional revenue for your company, explore the exciting world of OData. You just might find some unexpected benefits waiting for you.

Mary Jo is taking a couple of weeks off prior to the “busiest part of Microsoft’s 2012.”

You can check out the OData feed of my free US Air Carrier Flight Delays offering by signing up in the Windows Azure Marketplace DataMarket or adding it to your collection in OakLeaf Systems’s Data Hub sample. Run a query and choose DataExplorer’s XML option to display the raw OData content:




Windows Azure Service Bus, Access Control, Identity and Workflow

No significant articles today.



Windows Azure VM Role, Virtual Network, Connect, RDP and CDN

Maarten Balliauw (@maartenballiauw) reminded me that the Windows Azure Content Delivery Network (CDN) has dynamic content capabilities similar to those announced by Amazon Web Services today (see the Other Cloud Computing Platforms and Services section below.) He described them in the “Serving dynamic content through the CDN” section of his Using the Windows Azure Content Delivery Network article of 4/5/2012 for the ACloudyPlace blog:

… Serving dynamic content through the CDN

Some dynamic content is static in a sense. For example, generating an image on the server or generating a PDF report based on the same inputs. Why would you generate those files over and over again? This kind of content is a perfect candidate to cache on the CDN as well!

Imagine you have an ASP.NET MVC action method which generates an image based on a given string. For every different string the output would be different, however if someone uses the same input string the image being generated would be exactly the same.

As an example, we’ll be using this action method in a view to display the page title as an image. Here’s the view’s Razor code:

@{
    ViewBag.Title = "Home Page";
}

<h2><img src="/Home/GenerateImage/@ViewBag.Message" alt="@ViewBag.Message" /></h2>
<p>
    To learn more about ASP.NET MVC visit <a href="http://asp.net/mvc" title="ASP.NET MVC Website">http://asp.net/mvc</a>.
</p>

In the previous section, we’ve seen how an IIS rewrite rule can map all incoming requests from the CDN. The same rule can be applied here: if the CDN requests /cdn/Home/GenerateImage/Welcome, IIS will rewrite this to /Home/GenerateImage/Welcome and render the image once and cache it on the CDN from then on.

As mentioned earlier, a best practice is to specify the Cache-Control HTTP header. This can be done in your action method by using the [OutputCache] attribute, specifying the time-to-live in seconds:

[OutputCache(VaryByParam = "*", Duration = 3600, Location = OutputCacheLocation.Downstream)]
public ActionResult GenerateImage(string id)
{
    // ... generate image ...

    return File(image, "image/png");
}

We would now only have to generate this image once for every different string requested. The Windows Azure CDN will take care of all intermediate caching.
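The two moving parts described above can be sketched without any web framework. The following Python snippet is only an illustration, not the article's ASP.NET MVC code: one hypothetical helper strips the /cdn prefix the way the rewrite rule does, and another produces a Cache-Control header of the kind a downstream cache needs in order to keep a response for an hour.

```python
def rewrite_cdn_path(path, prefix="/cdn"):
    """Map a path requested by the CDN onto the application's own path."""
    if path == prefix or path.startswith(prefix + "/"):
        return path[len(prefix):] or "/"
    return path


def cache_headers(max_age_seconds=3600):
    """Headers that let downstream caches reuse the response for a while."""
    return {"Cache-Control": "public, max-age=%d" % max_age_seconds}


print(rewrite_cdn_path("/cdn/Home/GenerateImage/Welcome"))
print(cache_headers())
```

Because the cache key is the full URL, each distinct input string still produces its own cached image.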

Note that if you’re using some dynamic bundling of CSS and JavaScript like ASP.NET MVC4’s new “bundling and minification”, those minified bundles can also be cached on the CDN using a similar approach.

Conclusion

The Windows Azure CDN is one of the building blocks to create fault-tolerant, reliable, and fast applications running on Windows Azure. By caching static content on the CDN, the web server has more resources available to process other requests, and users will experience faster loading of your applications because content is delivered from a server closer to their location.

Full disclosure: I’m a paid contributor to Red Gate Software’s ACloudyPlace blog.



Live Windows Azure Apps, APIs, Tools and Test Harnesses

Brian Swan (@brian_swan, pictured below) posted Azure Real World: Optimizing PHP Applications for the Cloud authored by Ken Muse on 5/14/2012:

This is a guest-post written by Ken Muse, Vice President of Technology at ecoInsight. He has written an amazingly in-depth post about optimizing PHP applications for Windows Azure based on his experience in doing a mixed PHP (Joomla!) and .NET deployment. ecoInsight is using Joomla! running on Windows Azure to manage, maintain, and publish informative articles and industry news to its users. The Joomla! server is used by the company’s content teams to aggregate industry content and publish it to end users in several different vertical markets as AtomPub feeds. The advertisements displayed to users of the ecoInsight Energy Audit & Analysis desktop and mobile platforms are also published using Joomla! In the future, ecoInsight plans to utilize the Joomla! platform for building a collaborative social network and to provide content storage services.

Mr. Muse is responsible for the overall technical architecture of ecoInsight’s software solutions. Mr. Muse was a senior architect and team lead at SAP, focusing on security, globalization standards, and distributed systems integration. Mr. Muse has a breadth of software and hardware knowledge, including a broad skill set in enterprise and distributed network architecture and multi-platform software development. Mr. Muse is a member of MENSA and IEEE (Institute of Electrical and Electronics Engineers), and a full member of ASHRAE (American Society of Heating, Refrigerating and Air‐Conditioning Engineers).

Optimizing PHP Applications for the Cloud

As a technology company, you are often forced to look for the right balance between available options. Even though our primary product is written using the .NET Framework, we still find it important to take advantage of existing applications and platforms whenever possible. In several cases, the applications we use are written in PHP. While a number of efforts have been made to simplify the task of using PHP on Windows Azure, it is still important to realize that all server-based applications require some level of adjustment to get the best possible performance and reliability. In this article, I wanted to share some of the optimizations and considerations we made to get the best performance from the PHP applications we are hosting on Windows Azure.

PHP Configuration

The master configuration for the PHP runtime is the PHP.INI file. This file controls the numerous settings used by PHP when processing a request. It’s important to make sure you start with a file containing the proper configuration for an IIS deployment. If PHP is installed using the Web Platform Installer (WebPI), this will be automatically configured. If not, the Learn IIS web site provides all of the details here.

One improvement you can make in this file is to remove unnecessary extensions. Each referenced extension library is loaded by the PHP runtime when it is initializing. These libraries then have to register the various functions and constants that are available to the PHP runtime and perform any necessary initialization. By removing any extension which is not used, this additional overhead can be removed.
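One way to find candidates for removal is to compare the extensions enabled in PHP.INI against the ones the application is known to need. The Python sketch below is a rough illustration under that assumption; the sample file contents and the list of required extensions are made up, and deciding which extensions an application really uses still takes testing.

```python
import re

# Active "extension=..." lines; a leading ";" marks a comment in php.ini.
EXTENSION_LINE = re.compile(r'^\s*extension\s*=\s*"?([^";\s]+)"?', re.IGNORECASE)


def enabled_extensions(ini_text):
    """Return the extension libraries that php.ini currently loads."""
    found = []
    for line in ini_text.splitlines():
        match = EXTENSION_LINE.match(line)
        if match:
            found.append(match.group(1))
    return found


def unused_extensions(ini_text, required):
    """Return enabled extensions that are not in the required list."""
    required_lower = {name.lower() for name in required}
    return [ext for ext in enabled_extensions(ini_text)
            if ext.lower() not in required_lower]


# Hypothetical php.ini fragment, for illustration only.
sample = """\
[PHP]
extension=php_mysqli.dll
;extension=php_curl.dll
extension = php_gd2.dll
extension=php_soap.dll
"""
print(unused_extensions(sample, required=["php_mysqli.dll", "php_gd2.dll"]))
```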

An important consideration with your PHP configuration is that these settings may need to change based on the PHP deployment. Settings for the server name, database connectivity, and external resources can change quite quickly in the cloud. The use of Windows Azure does not guarantee zero downtime.

On February 29, 2012, Windows Azure experienced a partial service outage, the “Leap-Day Outage”. The partial outage impacted a large number of servers. The Windows Azure team was incredibly transparent about the details of the scope, cause, and fixes involved in this outage. I highly recommend reading their blog article for more information. You can find it at http://blogs.msdn.com/b/windowsazure/archive/2012/03/09/summary-of-windows-azure-service-disruption-on-feb-29th-2012.aspx.

In order to account for this inevitability, your application must be fault tolerant and easily reconfigured. During the leap-day outage, we were able to recover and minimize the impact of the outage. We quickly redeployed our web roles and a read-only snapshot of our database instance to a different data center until the problem was resolved. While some portions of the application were forced to use this read-only database snapshot, users were still able to use the entire application with a minimal disruption. If settings are hard-coded into a PHP file, these kinds of redeployments are difficult. You would be forced to modify your application and rebuild the deployment package. Then, you would need to undo those changes if you needed to relocate your resources back to the original servers. If your settings are obtained dynamically, you can redeploy the package to another server with very little difficulty.

There are several strategies which an application can use for configuring its settings or to configure PHP.INI:

  • The application can receive its settings externally from the csdef using the Windows Azure SDK for PHP. The current implementation of the Windows Azure SDK for PHP uses a command line tool in order to read the service definition settings from Windows Azure. It is therefore important to make sure to take advantage of the caching features in PHP to minimize the frequency of calls to those SDK functions if you use this approach.
  • Rewrite the PHP and application configuration files using a startup task. Combined with the previous method, this can minimize the number of times the SDK is used to retrieve values. The most common way to perform this task is to use a batch file or Windows PowerShell script, although nothing prevents you from invoking the PHP command line.
  • Configure the command-line parameters for the PHP process to pass additional “defines” on the command line. By scripting the FastCGI configuration to pass additional defines (-d) into the PHP runtime with these settings when the role is starting, it is possible to get the flexibility of using external configuration values without the overhead of executing the SDK tools each time a value is needed.
  • Use the PHP Contrib extension, php_azure.dll. This provides a native method for retrieving configuration settings using the published Windows Azure native library APIs. The method azure_getconfig() is able to directly retrieve the configuration settings with minimal overhead; it is the equivalent of calling RoleEnvironment.GetConfigurationSettingValue() from the .NET runtime. It is important to be aware that calls to this function require the PHP runtime to have %RoleRoot%\base\x86 in the PATH. This allows the native and diagnostics libraries to be loaded by the PHP runtime.
    The extension and its documentation are available from http://phpazurecontrib.codeplex.com.

Whenever possible, you should try to make sure your application can retrieve its configuration settings directly from the role. This provides you the greatest control over the configuration and minimizes any special handling that might be required to deal with restarts or redeployments. Since the machines are stateless, you are not guaranteed a constant configuration between any two starts.

By using one or more of these techniques, the application can be dynamically pointed to appropriate Windows Azure resources with minimal effort.
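The advice about caching expensive configuration lookups can be illustrated in a few lines. This Python sketch is not the Windows Azure SDK for PHP; slow_lookup is a hypothetical stand-in for any costly call (such as launching a command-line tool), and the setting names are invented.

```python
import functools

calls = {"count": 0}


def slow_lookup(name):
    """Stand-in for an expensive call, such as shelling out to an SDK tool."""
    calls["count"] += 1
    fake_role_settings = {"DbHost": "db.example.invalid", "DebugEnabled": "false"}
    return fake_role_settings[name]


@functools.lru_cache(maxsize=None)
def get_setting(name):
    """Return a setting, paying the lookup cost only once per name."""
    return slow_lookup(name)


for _ in range(3):
    host = get_setting("DbHost")
print(host, calls["count"])
```

The trade-off is staleness: a cached value survives until the process restarts or the cache is cleared (here with get_setting.cache_clear()), so the cache needs to be reset whenever the role's configuration changes.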

Configuring the Web Role

As mentioned previously, the Windows Azure instances require some configuration. There are many ways to deploy PHP, including using the scaffolding framework or the Web Platform Installer (WebPI) command line. Both of these are covered in some detail in numerous blog articles. You can also manually install and configure PHP as part of your deployment.

In our case, we realized that we wanted to ensure that the version of the PHP runtime we were using was completely configured and under our control. By using a specific runtime version, we could adjust the settings more specifically to our needs. In addition, owning the installation would allow us to verify that the extensions we are using will each behave as expected. In more extreme circumstances, this would also allow us to patch the components for any discovered issues. We package the PHP runtime with the deployment and use startup scripts to configure the settings required by IIS, including configuring FastCGI and handlers for the *.php file extension.

There are two important things to remember in the configuration process:

  • The configuration scripts must be idempotent. These startup scripts are not just executed when the Windows Azure instance is first deployed; they can be executed any time Windows Azure restarts or reconfigures the instance. This means that it is important to test the scripts and to make sure that the scripts can handle being run multiple times. Many sample startup scripts for configuring IIS forget this. This can lead to instances failing to launch or suddenly becoming unavailable and refusing to reload. If you are using appcmd to deploy settings to IIS, make sure that the script is either detecting an existing configuration or deleting the configuration section and recreating it.
  • The configuration of the instance can change any time the instance is redeployed, and this configuration change can occur without the server rebooting. When deploying an upgrade, this is especially noticeable. Windows Azure can disconnect the drive hosting the current deployment and connect a new drive containing the upgraded deployment. When this occurs, the paths to the application and any CGI values may need to change in order for the instance to remain usable. This is why idempotent startup scripts are necessary – the scripts may be configuring the PHP runtime to a new location on a completely different drive letter from the previous location. Several early failures on our system could be traced to this particular situation.
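The "delete the section and recreate it" approach to idempotency can be shown with a toy model. In this Python sketch a plain dictionary stands in for the server configuration; the function name, the entry layout and the drive paths are all hypothetical, and a real startup script would make the equivalent changes with appcmd.

```python
def apply_fastcgi_config(config, php_cgi_path):
    """Idempotently register a single FastCGI entry for PHP.

    `config` is a plain dict standing in for the server configuration.
    """
    apps = config.setdefault("fastCgi", [])
    # Drop any entry left by an earlier run, which may point at an old drive.
    apps[:] = [app for app in apps if app.get("name") != "php"]
    apps.append({"name": "php", "fullPath": php_cgi_path})
    return config


config = {}
apply_fastcgi_config(config, r"E:\approot\php\php-cgi.exe")
# Same script run again after an upgrade moved the deployment to another drive.
apply_fastcgi_config(config, r"F:\approot\php\php-cgi.exe")
print(config)
```

However many times the function runs, the configuration ends up with exactly one PHP entry pointing at the most recent location.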

Web.Config

Because we are running under IIS, we have to make sure that the application is properly configured. This involves making sure to include a valid web.config file with any additional definitions or settings required by the application. If you do not include a web.config with your application, one is automatically created. If you include a web.config, Windows Azure will modify the file slightly to include various settings used on the server. In some cases, these settings may also reflect changes made using the appcmd tool.

When hosting PHP under IIS – including with Windows Azure – you might find that your code occasionally fails with the generic 500 error message. This message makes it difficult to troubleshoot the issues. In these cases, adding the line <httpErrors errorMode="Detailed" /> can make the initial troubleshooting significantly easier. This will allow the error message returned by PHP to be displayed to the client. Important: It is not recommended that you leave this setting enabled in production since it can expose significant details about your application.

It’s important to remember that this is a base feature of IIS, so you can take advantage of this file to customize the behavior of your application on IIS. If you understand this file and the related configuration settings available on IIS you can further improve the application behavior. For example, in our deployments we commonly enable dynamic and static compression to reduce the bandwidth required by our application. Occasionally we enable the caching mechanisms to fine-tune the caching of documents and files hosted on IIS. These settings and more can be configured in one of two popular ways.

The first is to use the web.config packaged with your deployment. This will allow you to configure most of the common settings associated with your application. For example, one of the more important settings you can configure is the defaultDocument. If this is not properly configured, performance suffers whenever IIS attempts to resolve each request that does not specify a file. IIS is automatically configured with a default list which must be searched in order until either a match is made or a 404 occurs. By explicitly setting your defaultDocument – normally index.php – you can eliminate this search and improve the performance. For more details on configuring the default document, see http://www.iis.net/ConfigReference/system.webServer/defaultDocument.
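The cost of an overly long default document list is easy to see in a small model. The Python sketch below simply counts how many candidate names must be probed before a match is found; the candidate list and folder contents are illustrative assumptions, not a statement of what any particular IIS installation contains.

```python
def resolve_default_document(existing_files, default_documents):
    """Return (matched document or None, number of candidates probed)."""
    probes = 0
    for name in default_documents:
        probes += 1
        if name in existing_files:
            return name, probes
    return None, probes


files_in_folder = {"index.php", "logo.png"}
# Illustrative list; index.php sits last, so every request pays for six misses.
stock_list = ["Default.htm", "Default.asp", "index.htm", "index.html",
              "iisstart.htm", "default.aspx", "index.php"]

print(resolve_default_document(files_in_folder, stock_list))
print(resolve_default_document(files_in_folder, ["index.php"]))
```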

The second method is to use the appcmd command line management tool for IIS. This is a powerful way to fine-tune your configuration. This method gives you the ability to configure the web application (web.config) as well as the IIS instance (applicationHost.config). Using appcmd allows you to configure logging, dynamic and static compression, FastCGI, and many other aspects of the web role. This is still a very common way to configure the FastCGI settings to host the PHP runtime. If you’re interested in understanding this tool, I would recommend reading http://support.microsoft.com/kb/930909 and http://learn.iis.net/page.aspx/114/getting-started-with-appcmdexe/. In addition, you can refer to http://www.iis.net/ConfigReference/system.webServer/fastCgi for details on configuring FastCGI for PHP.

There are two considerations when using appcmd in your startup script. First, the deployment process may have modified some settings of your web.config file during the configuration of the server. As an example, the deployment process will configure the <machineKey> element. For this reason, do not assume that you will know the exact state of the web.config file. Second, be aware that your startup tasks may be run multiple times. This means that you must make sure any scripts using appcmd are idempotent.

Debugging

When developing or testing in PHP, it’s common to use a debugger extension, such as XDebug. Make sure that you remove these extensions from the production PHP configuration. Debugging extensions can impair the performance of a production server, so they should be used with care.

One recommendation in this regard is to place a debug flag in the service configuration (cscfg) file. You can then use one of the methods described in Configuring the Web Role to modify the PHP.INI at startup to configure the debugging extensions if this flag has been set. This method can also be used to install, enable, and configure other debugging components such as WebGrind.
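A flag-driven startup step of that kind might look roughly like the Python sketch below. It is an assumption-laden illustration: the extension line and file contents are hypothetical, and how the flag is read from the service configuration is left out. The function is written so that running it repeatedly with the same flag leaves the file unchanged.

```python
def set_debug_extension(ini_text, enabled, line="zend_extension=php_xdebug.dll"):
    """Return ini_text with the debugger line present only when enabled."""
    # Drop any existing copy first so repeated runs never duplicate the line.
    kept = [row for row in ini_text.splitlines() if row.strip() != line]
    if enabled:
        kept.append(line)
    return "\n".join(kept) + "\n"


base = "[PHP]\nmemory_limit=128M\n"
debug_on = set_debug_extension(base, enabled=True)
print(debug_on)
print(set_debug_extension(debug_on, enabled=False) == base)
```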

Don’t forget that it is also possible to use Remote Desktop to connect to an instance and enable debugging manually. If you do this, remember to remove the debugging extensions from PHP.ini or rebuild the instance when you are done!

Logging

Logging can make it much easier to understand the behavior of your application, but it can also slow down the execution of your program. Make sure that you minimize logging on your production instance unless you need it for diagnosing an issue. You can enable logging dynamically from the service configuration file using the same method described in Debugging.

Session Management

In case I haven’t mentioned it – Windows Azure Compute instances are stateless. This means that you cannot guarantee that any two requests will be handled by the same server. In addition, this means that the state on the instance is not guaranteed to be consistent with the state of any other instance. Since a load balancer can direct requests to any instance, it would be ill-advised for an instance to use any form of in-memory session management.

By nature, sessions provide state. This state must be shared and consistent across multiple instances. In order to make this work, you need a consistent and reliable means of storing the shared state. At the moment, Windows Azure Caching does not yet have support for PHP. This means that it is not currently an option for session management with PHP. The current recommendation is to use Windows Azure Table Storage. I would highly recommend reading Brian Swan’s excellent article on this topic, http://blogs.msdn.com/b/silverlining/archive/2011/10/18/handling-php-sessions-in-windows-azure.aspx. Make sure to also read the follow-up article which explains the importance of batching the data being inserted, http://blogs.msdn.com/b/silverlining/archive/2012/01/25/improving-performance-by-batching-azure-table-storage-inserts.aspx. It’s also important to remember that when storing session data, you must account for transient failures. Since Windows Azure uses shared resources, it is possible that resources will be briefly unavailable.
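Accounting for transient failures usually means retrying with a growing delay. The Python sketch below shows one common pattern, exponential backoff; it is not taken from the referenced articles, and TransientError, the attempt count and the delays are assumptions chosen for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, throttling)."""


def with_retries(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * (2 ** attempt))  # 0.5 s, 1 s, 2 s, ...


# Demo: a write that fails twice before succeeding.
state = {"calls": 0}


def flaky_write():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientError("storage briefly unavailable")
    return "saved"


delays = []
result = with_retries(flaky_write, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

Only errors known to be transient should be retried; a malformed request will fail the same way every time.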

Caching

There is not always a need to dynamically create content for the user. In many cases, resources change infrequently. Caching provides a way of quickly responding to a request with content that already exists. It also provides a way of storing content closer to the user; in some circumstances, the content can be stored in the user’s browser, reducing the number of server requests.

Static Content

If files are not changing often, they are perfect candidates for caching. This type of content can be placed in Windows Azure Blob Storage and served using the Windows Azure CDN. This reduces the load on the server and places the content closer to the user. By default, content on the Windows Azure CDN will be cached for 72 hours. If you are not familiar with using the Windows Azure CDN, a hands-on lab is available at http://msdn.microsoft.com/en-us/gg405416.

Output Caching

Output caching preserves dynamic content for a period of time. When the content is requested before the cache expires, IIS will automatically serve the cached content and will not execute the PHP script again. This reduces the number of times a script has to be executed, improving overall performance. This topic is covered in depth in a blog post, http://www.microsoft.com/web/post/performance-tuning-php-apps-on-windowsiis-with-output-caching.

File-based Caching

Of course, caching sometimes requires more advanced control. One strategy which we have seen used is to generate content on the server (or directly to Azure Blob Storage). If this file is being presented directly to the user, the page can redirect the user’s browser to this cached content. If the data is one or more database objects, then PHP serialization can be used to store and retrieve the object. Until circumstances change that require the code to create new output, the application can continue to use this cached document.

A word of warning here – if you are using this method, make sure that you are not inadvertently exposing the data publicly. Also, keep in mind that unless you are using Windows Azure Blob Storage, these caches are local to the server. If you are using Windows Azure Blob Storage, remember that there is a chance of transient issues, so you must have some form of retry logic to ensure the persistence. Finally, don’t cache anything which relies on synchronization between the instances of the role. Remember that each role is independent and stateless.
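As a rough illustration of the file-based approach, here is a minimal sketch. It is written in Python rather than PHP and uses JSON where a PHP application would use serialize/unserialize. The function names (cache_get, cache_put, get_or_build) are hypothetical, and the cache lives on the local disk, so the warnings above about per-instance caches still apply.

```python
import json
import os
import tempfile
import time


def cache_get(path, max_age_seconds):
    """Return the cached value, or None if the file is missing or too old."""
    try:
        age = time.time() - os.path.getmtime(path)
        if age > max_age_seconds:
            return None
        with open(path, "r", encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError):
        # Missing, unreadable, or corrupt cache file: treat as a cache miss.
        return None


def cache_put(path, value):
    """Write to a temporary file, then rename, so readers never see a partial file."""
    directory = os.path.dirname(path) or "."
    fd, temp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(value, handle)
    os.replace(temp_path, path)


def get_or_build(path, max_age_seconds, build):
    """Serve from the cache when possible; otherwise rebuild and store."""
    value = cache_get(path, max_age_seconds)
    if value is None:
        value = build()
        cache_put(path, value)
    return value
```

The expensive work (the build callback) only runs when the cached file is missing or has expired. Keep the cache directory outside the web root so the files are not served publicly.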

If all of this seems quite challenging and risky, that’s because it can be. Of course, this is PHP so there are always more options for caching this data. Two of the most common are Wincache and Memcached.

Wincache

Most PHP veterans with any experience using Windows are familiar with Wincache (and for everyone else, you are already most likely using something similar, APC). This extension increases the speed of PHP applications by caching the scripts and the resulting byte code in memory. This improves the overall performance and reduces the I/O overhead associated with reading the script files. Configuring the extension is quite simple:

  1. Copy php_wincache.dll to the PHP extensions folder
  2. Register the extension in the php.ini file: extension = php_wincache.dll
  3. Optionally, enable the WinCache Functions Reroutes as described here: http://www.php.net/manual/en/wincache.reroutes.php. This improves the performance of certain file-system related calls.
  4. Deploy the application with the new settings. For local testing, restart the Application Pool in IIS so that the change is applied.

You will observe the session.save_handler was not configured to use Wincache. This is one change for Azure that is quite easy to miss! Remember that Windows Azure instances are stateless. The Wincache session management relies on using local memory. As a result, the session state would not exist across multiple servers. For this reason, session management has to use one of the persistence mechanisms provided by the platform.

Memcached

Many PHP applications cache data using Memcached to minimize database access. On Windows Azure this is even more important. SQL Azure is a shared service which has limits on the overall resource utilization. In addition, SQL Azure throttles client connections.

Memcached can be deployed on both Web and Worker roles, and it can be accessed in cluster mode from PHP applications. Maarten Balliauw has made a scaffolder available for use with the Windows Azure SDK for PHP at https://github.com/interop-Bridges/Windows-Azure-PHP-Scaffolders/tree/master/Memcached. This site includes details about the implementation and usage instructions.
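The usual way to put such a cache in front of the database is the cache-aside pattern. The sketch below is in Python and uses a small in-process class (TtlCache, a hypothetical stand-in) instead of a real Memcached client, so it only illustrates the control flow, not the scaffolder's API.

```python
import time


class TtlCache:
    """In-process stand-in for a Memcached client (hypothetical, for illustration)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)


def fetch_with_cache(cache, key, load_from_database, ttl_seconds=300):
    """Cache-aside: try the cache, fall back to the database, then populate."""
    value = cache.get(key)
    if value is None:
        value = load_from_database(key)
        cache.set(key, value, ttl_seconds)
    return value
```

Only a cache miss reaches the database, which helps keep the application under the SQL Azure connection and resource limits mentioned above.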

PHP and SQL Azure

The performance of an application can be limited by the performance of the slowest operation. With our Joomla instance, we are taking advantage of SQL Azure for the database support. The performance when using this resource – which is external to the application server – can directly impact the ability of the web role to perform its job. A poorly tuned query can quite easily make the difference between a sub-second response time and an activity time out when dealing with large amounts of data. When a PHP application is performing slowly, in many cases the database queries can be the root of the problem. In a distributed system, this is even more likely to be true since there is more latency. To get the best performance out of your application, you must ensure that any database access is performed as efficiently as possible.

The Driver

If you need to access a database, you need to use drivers. In the past, you might have used the bundled extension to access the mssql functions. Starting with PHP version 5.3, these functions are no longer included with the PHP installation for Microsoft Windows. These functions have been replaced by a new driver from Microsoft which is open-source and available at http://sqlsrvphp.codeplex.com. The new sqlsrv functions are more efficient and vendor-supported. This driver supports both SQL Server and SQL Azure. In general, I recommend trying to keep your drivers up to date so that you can take advantage of bug fixes and platform enhancements. Like any other extension, you will need to place the non-thread safe (NTS) version of the files in your extensions folder and include any required configuration in the PHP.INI file. Full instructions are included with the Getting Started guide. Be aware that version 3.0 of the drivers does not include support for PHP 5.2 and earlier. For that, you will require the older version 2.0 drivers.

We found it quite simple to port legacy code to the new driver; the APIs are nearly identical with the exception of changing the prefix from mssql to sqlsrv. The biggest difference tends to be improved error handling methods in the new APIs: you will need to replace mssql_get_last_message (which returns a single result) with the method sqlsrv_errors (which returns an array of arrays).

From version 2.0 onwards, the drivers include support for PHP Data Objects (PDO). PDO provides an abstraction layer for calling database driver functions. You can read more about PDO in the PHP Manual. While a discussion of PDO is outside the scope of this article, the principles discussed here apply equally whether you are using PDO or not.

Understanding SQL Cursors

To understand how to optimize the PHP code, you must first understand a bit about the cursor types you see in PHP. Each type is optimized for specific uses and has certain performance tradeoffs. Selecting the right type for the job is therefore very important. In most cases, the default cursor type (SQLSRV_CURSOR_FORWARD) is the preferred choice. For more details on these cursor types, I recommend reading the MSDN article: http://msdn.microsoft.com/en-US/library/ee376927(v=SQL90).aspx.

Static

A static cursor (SQLSRV_CURSOR_STATIC) will generally make a copy of the data that will be returned. This is done by creating a work table to store the rows used by the cursor in a special database called tempdb. If there is enough data, it also starts an asynchronous process to populate the work table to improve the performance. This process has a performance cost since the database server must retrieve and copy the records.

Dynamic

By comparison, a dynamic cursor (SQLSRV_CURSOR_DYNAMIC) works directly from the tables, avoiding the overhead of copying the data but increasing the time it takes to find the data for a single row. Dynamic cursors typically do not ‘snapshot’ the data, meaning it is possible for the underlying data to change between the time the cursor is created and the time the values are read.

Forward Only

The forward-only cursor (SQLSRV_CURSOR_FORWARD) is a specialized type of dynamic cursor which improves the performance by eliminating the need for the cursor to be able to navigate both forwards and backwards through a result set. As a result, a forward-only cursor can only read the current row of data and the rows after it. Once the cursor has moved past a row, that row is no longer accessible. This is the default cursor type and is ideal for presenting grids or lists of data. Because the results are materialized and sent to the client dynamically, dynamic and forward-only cursors cannot use the sqlsrv_num_rows function.

Keyset

A keyset cursor (SQLSRV_CURSOR_KEYSET) behaves similarly to a static cursor, but it only copies the keys for the rows into a keyset in tempdb. This improves performance, but it also allows the non-key values to be updated. Those changes are visible when moving through the data set, similar to a dynamic cursor. If one or more tables lack a unique index, a keyset cursor will automatically become a static cursor. Both keyset and static cursors can use sqlsrv_num_rows to retrieve the number of records in the result set. This means that when calling sqlsrv_num_rows, the database server will be working with a copy of the data stored in tempdb.

Using the Right Cursor

In one of the third-party software components we utilize, the application needed to count the total number of records on the server to implement a paged display. A second query would then request the current page of data for display. Paging was being used to limit how many records were returned since the dataset could be quite large. This is a fairly common scenario for displaying results in a grid. Both of these queries were originally configured to use a static cursor so that sqlsrv_num_rows could be called. This pattern is quite common in many PHP scripts. Unfortunately, this pattern has a serious flaw.

When the query to determine the total number of records was invoked, the large dataset was copied into tempdb. As the number of records grew, the time required for this processing also increased. This query was not used to generate the actual data view, so none of the data copied into tempdb was used by the client. Before long, this query was taking several seconds to process. It didn’t take long before users began to complain about the time required to view each page.

Fixing this issue is surprisingly simple. First, we converted both queries to use a forward-only cursor. This allowed us to work with the dataset more efficiently since we were no longer copying the records into tempdb. This was also the ideal cursor type for returning the paged results since the grid view was rendering each row in order. For the query which determined the total number of records, the call to sqlsrv_num_rows was replaced by a standard SELECT COUNT query. The modified queries took only a few milliseconds to return their results.
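The difference between the two counting strategies can be sketched outside of PHP. The example below uses Python's standard sqlite3 module as a stand-in for SQL Azure, so the table, function names, and the LIMIT/OFFSET paging syntax are assumptions for illustration, not the queries from the component described above.

```python
import sqlite3


def build_demo_db(row_count):
    """Create an in-memory table with row_count sample articles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO articles (title) VALUES (?)",
        [("Article %d" % n,) for n in range(1, row_count + 1)],
    )
    return conn


def count_by_materializing(conn):
    """The slow pattern: pull every row just to count them."""
    return len(conn.execute("SELECT id, title FROM articles").fetchall())


def count_on_server(conn):
    """The fix: let the database count and return a single value."""
    return conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]


def fetch_page(conn, page, page_size):
    """Read one page of rows in order (SQLite paging syntax)."""
    offset = (page - 1) * page_size
    cursor = conn.execute(
        "SELECT id, title FROM articles ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cursor.fetchall()
```

Both counting functions return the same number, but only the first one forces every row to be produced and transferred before the count is known.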

Prepared Statements

Another way to improve your PHP code is to take advantage of prepared statements. Prepared statements basically provide the database server a cacheable template of your query which can then be executed multiple times with different parameters. This reduces the time it takes for the database server to parse the query since the server can reuse the cached query plan. More importantly, prepared statements automatically escape the query parameters. By eliminating string concatenation and the need to escape the query parameters manually, prepared statements can significantly reduce the risk of a SQL injection attack if used correctly. A final benefit is that prepared statements separate the query template from actual parameters. This can improve the maintainability of the code. It can also be very beneficial if you are trying to support multiple databases!

This feature is more than just a best practice recommendation. It is considered to be so important that it is the only feature PDO drivers are required to emulate if it is not actually supported by the database. You can view a complete example of how to use prepared statements with the SQL Server driver here.
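To show the idea in a self-contained form, the following sketch uses Python's standard sqlite3 module instead of the SQL Server driver. The table and the find_user helper are hypothetical. The point is only that the query text is a fixed template and the values are bound separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# One statement template, executed several times with different parameters.
insert_sql = "INSERT INTO users (name) VALUES (?)"
names = ["Alice", "Bob", "Robert'); DROP TABLE users;--"]
conn.executemany(insert_sql, [(name,) for name in names])


def find_user(connection, name):
    """Look up a user by name; the value is bound, never concatenated."""
    return connection.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchone()
```

The third name contains SQL syntax, but because it is passed as a parameter it is stored and matched as plain text and the table survives.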

The Tools of the Trade

The Microsoft SQL Server platform provides several tools which can be helpful for maintaining and optimizing your code when dealing with SQL Azure. You can download the current set of tools from http://www.microsoft.com/sqlserver/en/us/get-sql-server/try-it.aspx.

SQL Server Data Tools (SSDT)

SQL Server Data Tools is an integrated collection of tools for managing, maintaining, and debugging SQL scripts. It provides a GUI editor for table definitions, enhanced query debugging, and a system for managing and maintaining your scripts in source control. In addition, SSDT has the ability to create a data-tier application (DAC). A DAC provides a way of managing, packaging, and deploying schema changes to SQL Azure. When a DAC is deployed to SQL Azure, it can perform most of the common schema migration tasks for you automatically, drastically reducing the effort required to update your application. In addition, a DAC project can make managing scripts for SQL Azure easier; it provides Intellisense and identifies invalid SQL statements. A hands-on lab is available which walks through creating, managing, and deploying schema changes to SQL Azure using this tool. You can find it at http://msdn.microsoft.com/en-us/hh532119.

You can learn more about SSDT from http://msdn.microsoft.com/en-us/data/gg427686.

SQL Server Profiler

The SQL Server profiler provides a way of capturing and analyzing the queries being sent to a SQL Server instance. By capturing the queries being sent and examining the execution times for each query, you can more easily identify the queries that are taking the most time. Not only that, you can identify additional issues such as repetitive calls and excessive cursor usage.

Using this tool, we noticed that one particular application made a database call for each row that was going to be displayed on the screen. Looking more closely, we noticed that this query was being used to return a single value from another table. This was accounting for almost 80% of the time required to render that page. By modifying the original query to include a JOIN, we were able to eliminate this overhead and reduce the number of calls to the database.
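The shape of that fix can be sketched as follows. The example uses Python's standard sqlite3 module, and the articles and categories tables are hypothetical stand-ins for the application's real schema.

```python
import sqlite3


def build_demo_db():
    """Create two small related tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, category_id INTEGER);
        INSERT INTO categories (id, name) VALUES (1, 'News'), (2, 'Sports');
        INSERT INTO articles (id, title, category_id) VALUES
            (1, 'Budget passes', 1), (2, 'Cup final', 2), (3, 'Election', 1);
        """
    )
    return conn


def rows_with_per_row_lookup(conn):
    """One query for the list, plus one extra query per row (N + 1 calls)."""
    result = []
    articles = conn.execute(
        "SELECT id, title, category_id FROM articles ORDER BY id"
    ).fetchall()
    for article_id, title, category_id in articles:
        name = conn.execute(
            "SELECT name FROM categories WHERE id = ?", (category_id,)
        ).fetchone()[0]
        result.append((article_id, title, name))
    return result


def rows_with_join(conn):
    """The same result from a single query."""
    return conn.execute(
        """
        SELECT a.id, a.title, c.name
        FROM articles AS a
        JOIN categories AS c ON c.id = a.category_id
        ORDER BY a.id
        """
    ).fetchall()
```

With three rows the difference is invisible, but against a remote database every extra call adds a network round trip, so the per-row version gets slower as the grid grows.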

SQL Server Query Analyzer

Once you’ve identified a query that is taking an excessive amount of time, you need to learn why. The Query Analyzer provides the tools for executing and debugging SQL queries and analyzing the results. This tool also provides a graphical visualization of the query plan to enable you to more effectively find the bottlenecks in your query. This tool is now part of SSDT.

Database Engine Tuning Advisor

If you’ve taken the time to profile your application, you might be interested in learning about optimizations you can make to your schema which might improve the overall performance. The Database Engine Tuning Advisor (DTA) examines your queries and suggests indexes, views, and statistics which can potentially improve the overall performance of your application. Of course you still need to examine whether the proposed changes provide real improvement, but this can be a great help in identifying changes which can improve the overall performance of the application.

Because both SQL Server and SQL Azure have the ability to identify optimal query execution plans, you may find some surprising suggestions. SQL Server can use indexes for schema structures that are not part of the actual query, so this can open up the potential for unexpected optimizations. In our case, a particularly slow query was able to be optimized by creating an indexed view. Because this index provided coverage for the query, SQL Azure was able to use the index and view when retrieving the requested data.

A tutorial for the 2008 R2 edition is available at http://technet.microsoft.com/en-us/library/ms166575(v=sql.105).aspx.

SQL Azure Federations

Federations are the newest addition to the SQL Azure platform. Federated databases provide a way to achieve greater scalability and performance from the database tier of your application. If you’ve done any large scale development on PHP, you will be familiar with the concept of “sharding” the databases. Basically, one or more tables within the database are distributed across multiple databases (referred to as federation members). By separating the data across multiple databases, you can potentially improve the overall performance of your database system. Using SQL Azure Federations via PHP covers this topic in greater detail.
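To make the sharding idea concrete, here is a small conceptual sketch in Python. It assumes each member owns a contiguous range of key values; the FederationMap class and member names are hypothetical. It is not how an application talks to SQL Azure Federations, where the platform performs the routing for you. It only shows how a key maps to one member.

```python
import bisect


class FederationMap:
    """Maps a key to the member whose range contains it.

    boundaries holds the sorted low end of each member's range.
    """

    def __init__(self, boundaries, members):
        if len(boundaries) != len(members):
            raise ValueError("need one member per range boundary")
        if list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be sorted")
        self._boundaries = list(boundaries)
        self._members = list(members)

    def member_for(self, key):
        # Find the last boundary that is less than or equal to the key.
        index = bisect.bisect_right(self._boundaries, key) - 1
        if index < 0:
            raise KeyError("key %r is below the first range" % (key,))
        return self._members[index]
```

Splitting a busy range means inserting a new boundary and member, which is why queries that stay within one key value are the easiest to scale this way.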

For users that are more familiar with the SQL Server platform and tools, it is worth mentioning that the current tools – Microsoft Visual Studio and SQL Server Data Tools – do not yet support this feature if you are deploying the database as a data-tier application (DAC). As a result, federations must be managed manually at the current time.

Resiliency

The SQL Azure database is a shared resource which can be throttled or restricted based on how it is being used. Because it is a shared resource, it can also be unavailable for short periods of time. While this is not always the case, it is the reality of the cloud. For that reason, your code must not blindly assume that every connection will be successful or that every query will succeed. Two strategies will help you in this regard.

The first strategy is to use a retry policy with effective error handling. That is, if the connection or query fails due to a transient condition, you may create a new connection and attempt to perform the SQL query again. Don’t forget that SQL Azure can become unavailable during a query (closing the connection with an error) or between two queries. Poor error handling can lead to data corruption very easily in a distributed environment.
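A retry policy of this kind might look like the sketch below. It is written in Python, and TransientError is a hypothetical marker exception; a real application would first need to decide which driver error codes count as transient. The operation passed in should open its own connection and be safe to repeat.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def run_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on a transient failure, back off and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle it
            # Exponential backoff with jitter so instances do not retry in lockstep.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
```

Errors that are not transient (a syntax error, a constraint violation) are deliberately not caught, since repeating them would only add load. Wrapping a multi-statement update in a transaction keeps a retried operation from applying half of its changes twice.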

The other strategy is to isolate your read logic and write logic as much as possible. By separating these two concerns, it becomes possible to continue using your application even if there is a major service outage. In many applications, a large portion of the code is devoted to allowing the user to review stored information. It tends to be a smaller portion of the code which is involved in modifying the data. In these types of applications, a clean separation makes it possible to allow reading to continue independently from write operations. This allows users to continue to use your application, although possibly at a reduced capacity.

During the leap day outage, the SQL Azure instance responsible for serving news feeds to our users had availability issues. During this time, we reconfigured the compute instances which were hosting these services. The new configuration retrieved the data from a copy of the database in a different datacenter and did not allow updating or storing new data. Although we could not create new content or modify existing feeds during this time, our users were able to continue receiving the news feeds.
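The separation that made this possible can be sketched in a few lines. The example is in Python with plain dictionaries standing in for the primary database and its read-only copy. FeedRepository and ReadOnlyModeError are hypothetical names, not part of our actual code.

```python
class ReadOnlyModeError(Exception):
    """Raised when a write is attempted while the service is degraded."""


class FeedRepository:
    """Routes reads and writes separately so reads can fail over alone."""

    def __init__(self, primary, replica):
        self._primary = primary   # read/write store
        self._replica = replica   # read-only copy, e.g. in another datacenter
        self.read_only = False    # flipped by configuration during an outage

    def get_feed(self, feed_id):
        # Reads follow the configuration switch to whichever store is healthy.
        source = self._replica if self.read_only else self._primary
        return source[feed_id]

    def save_feed(self, feed_id, items):
        # Writes are refused outright rather than risking a divergent copy.
        if self.read_only:
            raise ReadOnlyModeError("updates are disabled during the outage")
        self._primary[feed_id] = items
```

Because every read goes through one method and every write through another, switching to degraded mode is a configuration change rather than a code change.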

Sizing the Servers

When deploying to Windows Azure one of the first considerations you have to make is the server size. Windows Azure is not a one-size-fits-all solution. It is a highly configurable set of many different types of services and servers. One of those configuration options is the size of the virtual machine instances that will host the web role. The size of the instance controls the availability of resources such as memory, CPU cores, drive space, and I/O bandwidth; you can read more on the configurations of the available virtual machine sizes at http://msdn.microsoft.com/en-us/library/windowsazure/ee814754.aspx.

This is an important decision for optimizing the performance of your PHP application. The various sizes each have limitations on the available resources – CPU, RAM, disk space, bandwidth, and overall performance – which have to be balanced with the application’s needs. For example, if the application requires extensive network bandwidth, a large server instance may be necessary in order to keep up with the system’s demands. On the other extreme, if the application requires mostly CPU resources and spends significant time in small, blocking operations, it may be advantageous to use multiple small instances so that Windows Azure can load-balance the incoming requests. In each case, the performance is very dependent on how the system is being used.

We have observed that an application running on 2 medium instances can behave very differently from the same application on 4 small instances. In at least one case, the small instances handled the load more effectively. This appears to be because the load balancer would send each request to the next available instance, preventing the CPU of any one instance from becoming saturated. I would caution that this was a specific case and that you should examine how the resources are being used by your application to understand the most efficient size for your web role. If you are unsure, start with a small instance and work up from there.

Our Stateless World

Remember that the Windows Azure servers are currently designed to be stateless. This means that any changes made locally on the server are not guaranteed to be available. The only way to have persistent storage is to commit the storage to one of the available persistence stores such as a Windows Azure drive, SQL Azure, or blob storage. Windows Azure servers can be reallocated and restarted at any time, so any manual configuration adjustments or changes can be lost unless they are part of your deployment package. For this reason, you cannot make any assumptions about how long local changes will persist. This can be confusing for new developers who might assume that changes made through a Remote Desktop session will continue to work. We have seen cases in which a server suddenly stopped working because a manual change was made to one or more files, and those changes were lost when the instance was redeployed suddenly by the Windows Azure controller in the middle of the night. The same thing can be said for any log files or other persisted content which does not use the proper Windows Azure storage mechanisms – you can’t guarantee the content will not be removed. The servers are stateless.

In one case, we discovered a component of our Joomla installation which was storing image content in a local folder on the server. This had two immediate side effects. First, not all of the servers had copies of this content. This meant that any time we increased the instance count of our Windows Azure deployment, customers would begin to receive 404 errors in the event they were routed to an instance which did not contain the physical file. Since local folders are not synchronized between instances, only the server which initially received the image would be able to serve that content. Second, we observed that the content was permanently lost if the Windows Azure instance was recreated, upgraded, or restarted. The fix to both problems was to make sure that any user-provided content was stored directly in, and subsequently retrieved from, Windows Azure’s blob storage. By making this adjustment, we also noticed a substantial performance change when accessing the content. The content was no longer using bandwidth on the server and we could now serve it using the Windows Azure CDN. One additional benefit – the content was now protected by the redundancy built into Windows Azure Storage.

Replacing local content storage with remote storage does not come without some cost. It can take longer to transfer a file to remote storage than to the local file system, especially if the server is acting as a gateway to the resources. This can require changes to the end-user experience, or configuring the code to allow the content to be directly uploaded into blob storage. This tradeoff is minor compared to the benefits you can gain. It is also a minor tradeoff compared to the risk of data loss from incorrectly assuming that your content will continue to persist.

Other Windows Azure Features

There are a number of other services available in Windows Azure which can greatly benefit your application. Taking some time to understand these other features can help you to get more out of Windows Azure.

Worker Role

Up until this point, we’ve discussed compute instances using the Web Role almost exclusively. For more advanced development, a Worker Role is also available. The Worker Role allows you to create processes and services which are continuously running and can provide additional features. Worker Roles can be used for calculations, asynchronous processing, notification, scheduling, and numerous other tasks. In short, if you need the equivalent of a background process or daemon, then the Worker Role is ideally suited for performing that job. Worker Roles are a type of compute instance, so they are billed the same way as a Web Role. This means that you still have to consider the related costs.

Windows Azure Blob Storage and CDN

As discussed earlier, one of the first considerations on Windows Azure is the server sizing. Part of sizing the server correctly is to understand the way the server is being used. A server which provides exclusively dynamic content has completely different usage than a server which is mostly static content. The smaller instances of Windows Azure offer much lower bandwidth than the larger instances. One approach is to use a larger server instance if you find you need more bandwidth. This certainly solves the problem, but you are now left with an under-utilized (and rather expensive!) server. A more scalable solution is to move the static content to the Windows Azure Blob storage and enable the CDN. This places the content closer to the end-user, improving the delivery characteristics. It also reduces the amount of network and I/O bandwidth required by the Role. Images, style sheets, static HTML pages, JavaScript, Silverlight XAP files, and other static content types can be placed onto blob storage to allow significantly larger scale at a lower total cost. For applications which are mostly static content, it is possible to use Extra Small instances and effectively host your web role.

Integrated Technologies (Advanced)

Remember that you’re running on a platform based on the Windows 2008 server technology. This means that any of the technologies available on that platform are available to you on Windows Azure. From within a compute instance, it is possible to leverage the platform to take advantage of additional features. Be forewarned that if you’re using some of these additional technologies, you must account for the stateless nature of the server, the need for idempotent scripts, and the fact that redeployments can result in your resources moving to a different drive or location.

Some of the features available to you:

  • ASP.NET. If you need a lightweight integration into Windows Azure or have components that run using the .NET runtime, you can use that technology side-by-side with PHP without any issue. For users comfortable with .NET, you can even create startup tasks and handle Role events through event handlers. You’ll likely want the Windows Azure SDK for .NET.
  • Background Startup Tasks. If you don’t need the resiliency and restarting features of a service, but you do need a simple background (daemon) task, then you can use a background startup task in Windows Azure to run a script or executable in the background of your deployment. Background startup tasks are started with the instance and (especially .NET based tasks) can respond to the RoleChange and RoleChanging events. This is the basis of several of the plugins provided with the Windows Azure SDKs.
  • Windows Azure Entry Point. By default, an entry point exists for every Web Role and Worker Role. You are allowed to provide a .NET based DLL containing a customized entry point based on the RoleEntryPoint class in your deployment and to include an <EntryPoint> element in your Service Definition (csdef) which provides the details required by Azure to use the entry point. This allows you to respond to the RoleChanged/RoleChanging events and to control the state of your role. For a VM role, this is not available and the recommendation in the Azure SDK is to use a Windows Service. For more advanced cases, you can even override the Run() method to run background tasks. When the Run() method ends, the Role is restarted. This provides you another mechanism for executing background tasks from a Web Role.
  • IIS. This technology is at the heart of every Web Role and provides the web hosting environment. As a result, all of the features of this platform (including Smooth Streaming support) are available to you. To really make the most of this feature, you’ll want to become intimately familiar with the appcmd tool described in Configuring the Web Role. You’ll also want to make sure to explore WebPI

Visual Studio LightSwitch and Entity Framework 4.1+

No significant articles today.


Windows Azure Infrastructure and DevOps

Bruno Terkaly (@brunoterkaly) posted Essential Windows Azure (Microsoft Cloud) Knowledge : Part 1: Web roles, Worker Roles on 5/13/2012:

Globally distributed data centers
The Windows Azure Platform is big, very big. It is comprehensive and perhaps you could argue it is complex, as all large systems invariably become. I want to use a series of posts to remind me what I “must” bring up during my Azure one-day, in person workshops.

In all seriousness, this post is directed to developers, architects and technical decision makers. Maybe in a future post I'll lower the technical barriers and explain things even more simply. I would argue this post covers the spectrum - from basic to fairly sophisticated.

I assume that you understand the Windows Azure Platform is a cloud-based computing technology from Microsoft, built upon a highly evolved programming environment and hosted in mega-data centers throughout the world.

Note: It appears to me that the illustration is a bit out of date with only four Windows Azure data centers indicated by the dark circle border. (Amsterdam isn’t included in the Green Energy list, so it appears as if the dark circle borders don’t specify greenness.) East and West US and the two Asian Windows Azure data centers are missing dark borders.
This post is very visual. I want to convey as much as I can with as many diagrams as possible. You obviously can't pull up PowerPoint during Thanksgiving, but if someone asks you for an explanation, having a visual in your head really helps.
I've been doing lectures about cloud computing for a few years now. Along the way I have constructed hundreds of slides that explain the Microsoft cloud, the Windows Azure platform. I want to present some of them to you here. They should help you understand the massive capabilities of the platform as well as explain how some things work.
The basics - hosting web sites and web services
The point of the diagram below is to think about hosting your web-based content and services. It also addresses running background processes.

image
  1. You can think of Compute as being a container for web roles and worker roles.
  2. Compute enables you to run application code in the cloud and allows you to quickly scale your applications. Each Compute instance is a virtual machine that isolates you from other customers.
    • Compute runs a Virtual Machine (VM) role
    • Compute automatically includes network load balancing and failover to provide continuous availability.
      • Windows Azure provides a 99.95% monthly SLA for Compute services
  3. Web roles are simply front-end web applications and content hosted inside of IIS in a Microsoft data center.
    • What is IIS?
      • Internet Information Services (IIS) is a web server application and set of feature extension modules that support HTTP, HTTPS, FTP, FTPS, SMTP and NNTP.
      • IIS can host ASP.NET, PHP, HTML5, and Node.js.

        Note that you are not limited to ASP.NET, or MVC. You can also use PHP, Node.js, and HTML5.

    • You can quickly and easily deploy web applications to Web Roles and then scale your Compute capabilities up or down to meet demand.
  4. Web roles can host WCF Services.
    • The Windows Communication Foundation (or WCF), is an application programming interface (API) in the .NET Framework for building connected, service-oriented applications.
    • WCF unifies most distributed systems technologies that developers have successfully used to build distributed applications on the Windows platform over the past decade.
    • WCF supports sending messages using not only HTTP, but also TCP and other network protocols.
    • WCF has built-in support for the latest Web services standards (SOAP 1.2 and WS-*) and the ability to easily support new ones.
    • WCF supports security, transactions and reliability.
    • WCF supports sending messages using formats other than SOAP, such as Representational State Transfer (REST).

ASP.NET Web Forms versus MVC
ASP.NET Web Forms has been around for a while and is a mature technology that runs small and large scale websites alike. MVC is the newer technology that promises many advantages.
  1. Web Forms is built around the Windows Form construction model
    • Web Forms have a declarative syntax with an event driven model.
    • Web Forms let visual designers use a drag-and-drop, WYSIWYG interface.
    • Web Forms make it possible for you to drop controls onto the ASP.NET page and then wire up the events.
      • Microsoft basically extended the Visual Basic programming model to the Web
  2. Web Form disadvantages include:
    • Display logic coupled with code, through code-behind files
    • Difficult unit testing because of coupling
    • ViewState and PostBack model
    • State management of controls leads to very large and often unnecessary page sizes

MVC
The ASP.NET MVC Framework is a web application framework that implements the model-view-controller (MVC) pattern.
  1. At the expense of drag and drop, MVC gives you a very granular control over the output of the HTML that is generated.
  2. MVC gives developers a ‘closer to the metal’ experience by providing full control and testability over the output that is returned to the browser.
  3. Clear separation of concerns
    • Results in strong support for unit testing
  4. MVC easily integrates with JavaScript frameworks like jQuery or Yahoo UI frameworks
  5. MVC allows you to map URLs logically and dynamically, depending on your use
  6. MVC uses RESTful interfaces by default (this helps out with SEO).
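As a rough illustration of the URL mapping mentioned in item 5, the sketch below matches a request path against a route template. It is written in Python rather than ASP.NET, and `match_route` is a hypothetical helper, not part of any framework. The template mirrors the `{controller}/{action}/{id}` shape commonly used as the ASP.NET MVC default route, and the sample path is made up.

```python
import re

def match_route(template, path):
    """Return a dict of route values if path fits template, else None.

    template uses {name} placeholders, e.g. "{controller}/{action}/{id}".
    Illustrative helper only; literal text in the template is not escaped.
    """
    # Turn each {name} placeholder into a named group matching one path segment.
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template.strip("/"))
    match = re.fullmatch(pattern, path.strip("/"))
    return match.groupdict() if match else None

# Hypothetical request path for a video portal.
route = match_route("{controller}/{action}/{id}", "/Videos/Encode/42")
```

Here `route` holds the controller, action and id segments as strings, and a path with too few segments yields `None`, so the caller can fall through to another route.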

Worker roles are part of compute but are not hosted in IIS.
Applications hosted within Worker roles can run asynchronous, long-running or perpetual tasks independent of user interaction or input.
  1. Worker roles let you host any type of application, including Apache Tomcat and Java Virtual Machines (JVM).
  2. Applications are commonly composed of both Web and Worker roles.
    • A common implementation in Windows Azure takes input from a Web role, sends those requests through a Queue to a Worker role, then processes the requests and stores the output.

    Sample Implementation
    Imagine that you are Microsoft and that you want to offer video encoding services to customers. That means that someone like me can take my home videos, upload them to the Microsoft Cloud, specifically Windows Azure Media Services. Next, I can use a management API that Microsoft provides, and programmatically encode my videos so they can run well on other devices. This simply means I want to take my vacation.mpg video and convert it to a native QuickTime format, like .mov files. Many of you blog readers know that there are many video formats, such as WMV, AVI, MP4, MOV - just to name a few.
    The diagram below illustrates how such an offering might exist. Let's walk through it.
    image
    A sample scenario
    Imagine the user wants to upload their video so they can get it encoded in multiple formats, so the video will look good across a spectrum of devices.
    Let's walk through a scenario.

    The portal that users interact with is a web role

    1. Note that the web role is the portal. It interacts with the user who wants to use Microsoft's video services.
      • Microsoft could have built the portal using ASP.NET Web Forms, MVC, PHP, HTML5, or Node.js. Microsoft probably would choose MVC because of its testability and fine-grained control over the HTML rendered to the user.
    2. The portal runs inside of IIS and inside a VM that is running Windows Server 2008 R2.
      • You may have multiple instances running, and Azure will automatically load balance requests across them.
    3. The web role can interact with the worker role using queues.
    4. The web role takes the user's video and stores it in Azure Storage, then sends the worker role some instructions about where the .mov files are and what the desired
      • It does so using the Windows Azure Queues.

    Background Process - Worker Role
    image
    1. Like a Windows Service
      • The Worker Role is similar to a windows service.
    2. Long Running
      • It starts up and is running all the time.
    3. No timer
      • Instead of a timer, it uses a simple while(true) loop and a sleep statement.
    4. Background processing
      • This is great for background processing.
    5. Data Required
      • Worker roles usually need some data to work with.
    6. The Queue is the data bridge
      • You can communicate between a worker and a web role via the use of a queue.
        image
    7. Worker role simply reads from queue
      • The worker role doesn’t care how stuff got into the queue
    8. First in First out
      • The worker role processes items in the queue using FIFO.
    9. The user interacts with the web role, not the worker role
      • Generally speaking it is the web role that is user driven and causes data to go into the queue.
    10. The worker role interacts with storage.
      image
    11. The worker role knows there are two types of blob containers
      • There are 3 main categories of storage - 2 Azure Blob Containers and one Azure Table
        • BlobContainer = Movies to Encode
          • Movies that still need to be processed and encoded.
        • BlobContainer = Encoded movies
          • The finished product, multiple movie formats, one for each device type
        • Azure Tables
          • Stores the meta data about the Azure blobs.
          • It records the location of the Azure blobs so the worker role knows where to read and write video content
            • It knows because of the two types of Azure blob containers
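The queue-driven worker described in the list above can be sketched roughly as follows. This is not Windows Azure code: Python's in-memory `queue.Queue` stands in for a Windows Azure Queue, `run_worker` and `max_idle_polls` are hypothetical names, and the "encoding" step is just a filename change. The idle limit exists only so the sketch terminates; a real worker role would keep looping.

```python
import queue
import time

def run_worker(work_queue, handle, poll_seconds=0.01, max_idle_polls=3):
    """Poll work_queue and pass each message to handle, oldest first.

    A real worker role would loop forever; max_idle_polls exists only so
    this sketch returns once the in-memory queue stays empty.
    """
    processed = []
    idle_polls = 0
    while True:
        try:
            message = work_queue.get_nowait()
        except queue.Empty:
            idle_polls += 1
            if idle_polls >= max_idle_polls:
                return processed
            time.sleep(poll_seconds)  # nothing to do: sleep, then poll again
            continue
        idle_polls = 0
        processed.append(handle(message))

# Web role side: put work on the queue and move on.
jobs = queue.Queue()
for name in ["vacation.mpg", "birthday.mpg"]:
    jobs.put(name)

# Worker side: the handler here only renames; real encoding would go in its place.
results = run_worker(jobs, lambda name: name.replace(".mpg", ".mov"))
```

The worker never needs to know who enqueued a message, which is what lets the web and worker roles scale independently.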

    image

    Notes for the diagram above
    Here are some details about the diagram above.

    1. The web role interacts with the user
    2. The user may download or upload files.
    3. The user may upload a video because they want it encoded
      • The web role would be the portal where the user does that
      • But the user may also wish to download the finished product (the video encoded by the worker role)
        • The portal must allow downloads from BlobContainer = EncodedMovies
    4. The web role could read/write Azure Tables. But we may choose to let the worker role do that.
      • The web role writes Azure blob locations as text strings to queues and forgets about them.
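One way to picture "blob locations as text strings" is a small text message that the web role writes and the worker role parses. The sketch below uses JSON purely as an assumption; the original does not say which message format is used, and the field names and the `movies-to-encode/vacation.mpg` path are hypothetical.

```python
import json

def make_encode_message(blob_path, target_formats):
    """Serialize an encoding request as a text string for a queue message."""
    return json.dumps({"source": blob_path, "formats": list(target_formats)})

def read_encode_message(text):
    """Parse the text back into (source blob path, list of target formats)."""
    job = json.loads(text)
    return job["source"], job["formats"]

# Web role: build the message text; worker role: read it back.
message = make_encode_message("movies-to-encode/vacation.mpg", ["mov", "mp4"])
source, formats = read_encode_message(message)
```

Keeping the message to a location plus a few options, rather than the video itself, keeps queue messages small and leaves the heavy data in blob storage.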

    image

    Notes for the diagram above.
    Notice many worker and web roles in many racks.
    Several Fabric Controller instances run in various racks in the data centers.

    1. One is elected to act as the primary controller.
      • If it fails, another picks up the slack.
      • The fabric controllers are redundant.
      • If you start a service on Azure, the FC can fall over entirely and your service is not shut down.
    2. The Fabric Controller uses the Preboot eXecution Environment
      • PXE, also known as Pre-Execution Environment; sometimes pronounced "pixie"
        • PXE is an environment to boot computers using a network interface independently of data storage devices (like hard disks) or installed operating systems
        • PXE leverages the Internet Protocol (IP), User Datagram Protocol (UDP), Dynamic Host Configuration Protocol (DHCP) and Trivial File Transfer Protocol (TFTP) to support bootstrapping a computer
      • The Fabric Controller runs Sysprep, and the system is rebooted as a unique machine

      Understanding the Fabric Controller

      image

      Nice diagrams, Bruno!


      Joseph Fultz wrote 5 Reasons to Start Working with Windows Azure for the May 2012 issue of MSDN Magazine’s Forecast Cloudy column:

      Everywhere you turn nowadays, you hear about the cloud: that it’s a major step in the evolution of the Web and will change the way you develop, deploy and manage applications. But not everyone has figured out how the cloud really applies to them. This is especially true for those with medium-to-large infrastructures and relatively flat usage consumption, where the capitalized cost is beneficial compared to the operational cost of the cloud. However, if your infrastructure is on the small side or you have a dynamic consumption model, the cloud (Windows Azure) is a no-brainer. Moreover, for shops heavy in process, where standing up a development environment is like sending a bill to Capitol Hill, Windows Azure can provide a great platform for rapid prototyping.

      It’s with those thoughts in mind that I want to point out some things about Windows Azure that I hope might spur you into putting the magazine down and putting some Windows Azure up.

      Joseph continues with detailed descriptions of the following five reasons:

        1. Great Tools Integration
        2. Performance and Scale
        3. Manageable Infrastructure
        4. You’re Writing the Code Already
        5. It’s the Future


      Windows Azure Platform Appliance (WAPA), Hyper-V and Private/Hybrid Clouds

      Michael Lubanski described Self-Service in System Center 2012 in a 5/14/2012 post to Thomas Shinder, MD’s Private Cloud Architecture blog:

      “Self-Service” is one of the core features of a Private Cloud. Self-Service gives the user the ability to request computing resources for his/her use with minimal or no interaction with their IT staff. In some cases, it may be as simple as browsing to a website, entering some information, such as the number and type of resources needed, then poof the resources are built and ready for use in a few minutes, without anyone on the IT staff having to manually build new machines.

      And while you might think the idea of requesting and getting IT resources without a long wait and lots of human interaction sounds too good to be true, with System Center 2012 it’s not only true but also quite easy to set up and make a reality.

      System Center 2012 comes with 3 options for Self-Service. For the purposes of this article, I’ll classify them into good – better – best and explain why in each section to help you decide on the best approach for your needs.

      Good

      Virtual Machine Manager 2012 Self-Service Portal. This in-box solution is nearly identical to the VMM 2008 R2 Self-Service Portal. The main difference is that it prompts the user for the cloud they want to deploy their virtual machine to, based on what they have permissions to use. Note that this is a basic, simple portal that is not easily customized.

      It can be set up in 4 simple steps:

      1. Install it from the Virtual Machine Manager setup menu
      2. Add user accounts to the self-service user role in VMM
      3. Grant users access to a cloud
      4. Grant users access to deploy a template

      That will give users access to the self-service portal and the ability to create and manage new machines based on the templates they have access to. This “good” solution is easy and fast to set up, but it offers little room to customize the user experience.

      Better

      AppController 2012 – A new option in System Center 2012 is AppController 2012. This is the new portal that is designed to provide a common self-service experience across private and public clouds that can help you easily configure, deploy, and manage virtual machines and services in both environments.

      AppController can manage multiple Virtual Machine Manager servers and Azure subscriptions. It also gives the user or application owner a self-service view for deployment and management of services and VMs. It provides a simple diagram view to help a user deploy a new service or VMs from templates. It also provides the interface to deploy apps and services to Azure and the ability to scale out existing deployments for better performance. This “better” solution is also easy to set up but provides more functionality for users deploying services and managing both private and public clouds from a single interface.

      Best

      Service Manager 2012 Self-Service Portal – A full deployment of System Center 2012 would include Service Manager 2012 and the new Cloud Services Process Pack. Service Manager provides an integrated platform for automating and adapting your organization’s IT service management best practices, such as those found in Microsoft Operations Framework (MOF) and Information Technology Infrastructure Library (ITIL). It provides built-in processes for incident and problem resolution, change control, and asset lifecycle management.

      Service Manager is designed to provide a full IT Service Catalog that can go beyond just virtual machine provisioning. Service Manager has Orchestrator integration to launch runbooks in other System Center components such as Virtual Machine Manager, Data Protection Manager, Operations Manager or Configuration Manager. It is more of a toolkit to build out Service Offerings and Request Offerings that can offer a wide range of self-service options for users – Active Directory requests, security access requests, capacity requests, change requests and more. The possibilities are plentiful given the Orchestrator integration.

      Likewise, the Cloud Services Process Pack is Microsoft’s infrastructure as a service solution built on the System Center platform. With the System Center Cloud Services Process Pack, enterprises can realize the benefits of infrastructure as a service while simultaneously leveraging their existing investments in the Service Manager, Orchestrator, Virtual Machine Manager, and Operations Manager platforms. In a nutshell, the Cloud Services Process Pack gives you a great first step toward Infrastructure as a Service by providing Service Manager components and Orchestrator runbooks. It provides a startup IaaS solution and building blocks to further expand those offerings.

      This “best” solution has the potential to be a rich self-service experience for your entire service catalog. It also includes the capabilities for different approaches to approvals and notifications.

      Summary

      As always, your individual needs will determine what is the best solution for you. Do you need to manage your public clouds? Do you need to customize the experience for your users? Do you need a workflow of approvals and notifications? Fortunately, System Center 2012 provides options to meet your needs no matter where you land, and they are all easy to set up and configure. With System Center 2012, you can provide Self Service of IT resources to your users!

      Also, it is important to note that all 3 components (VMM, AppController and Service Manager) can be controlled through scripts and automation using PowerShell modules or cmdlets. Here are links to that info:

      • Scripting in Virtual Machine Manager - link
      • Using AppController cmdlets - link
      • Cmdlets in System Center 2012 - Service Manager - link

      Thanks!
      Michael Lubanski
      Americas Private Cloud Center of Excellence Lead
      Microsoft Services
      mlubansk@microsoft.com



      Cloud Security and Governance

      Himanshu Singh (@himanshuks) asserted Research Shows Cloud Computing Reduces Time and Money Spent Managing Security in a 5/14/2012 post to the Windows Azure and SQL Azure blogs:

      Security is often reported as one of the top concerns or barriers to cloud adoption for businesses. To help dispel this perception, today Microsoft released research that shows that small and medium sized businesses (SMB) are gaining significant IT security benefits from using the cloud.

      The study, which was commissioned by Microsoft and conducted by research company comScore, polled SMB companies in the U.S., Singapore, Malaysia, India and Hong Kong markets.

      The U.S. findings show benefits in three main areas:

      • Time Savings: 32 percent of U.S. companies say they spend less time worrying about the threat of cyber-attacks.
      • Money Savings: SMBs that use the cloud are nearly 6x more likely to have decreased the total amount spent on security than SMBs not using the cloud.
      • Improved Security: 35 percent of U.S. companies have experienced noticeably higher levels of security since moving to the cloud.

      Time and money spent managing security prior to using cloud services is being re-invested by SMBs to grow their business and be more competitive.

      You can read more about the study on Microsoft News Center and the Trustworthy Computing Blog.


      Ed Moyle (@securitycurve) described Leveraging Microsoft Azure security features for PaaS security in a 5/14/2012 article for SearchCloudSecurity.com:

      As I discussed in my previous article, application security expertise is critical for PaaS security. Investments in developer education and software development lifecycle processes are imperative for an enterprise using PaaS environments. However, organizations on the whole have been slow to invest in application security.

      So for security pros in a PaaS-heavy environment, here’s the challenge: While application security investments are developed, what short-term measures might be useful as a PaaS security stopgap? For example, in the case of Azure, Microsoft’s numerous development-focused security resources are fantastic but what if the application is already written? There may not be time to incorporate SDL or Threat Modeling for that particular application.

      There are two things that shops in that position should know: First, the Azure environment itself provides some pretty robust security features to protect applications that live there, including measures like network-level controls, physical security, host hardening, etc. But those protections only tell half the story; any environment, no matter how well protected, can still be attacked through application-level issues that aren't addressed. Fortunately, there are a few features that we can layer on once the application is developed that add some measure of protection at the application layer.

      These stopgaps aren’t the only application security measures available (not by a long shot): they don’t include things you should be doing anyway (like SSL), and they’re not universally applicable to every use case. But these Microsoft Azure security features are useful for security pros to know about because they’re relatively quick to implement, require mostly minor code changes, and can many times get bolted on to an existing application without requiring extensive retesting of business logic.

      Microsoft Azure security: Partial trust

      Out of the box, Windows Azure provides some insulation against attacks that subvert the application: it runs user-supplied code as a non-administrator user on the native OS (this is almost entirely transparent from a caller standpoint). However, organizations can further restrict the access permissions available to a role by restricting it to “partial” instead of “full trust.”

      Folks familiar with the security model in a traditional .NET context will recognize the concept, but the idea is to restrict the impact of a security failure in the application itself by limiting what the application itself can do. Much like some Web servers and applications employ a “jailed” file system or restricted privilege model, the concept of partial trust is similar. Microsoft provides a full list of functionality available under partial trust and instructions on how to enable it in Visual Studio.

      There is a caveat though. While using partial trust can be a useful avenue to pursue in smaller applications/services, larger or more complicated ones (for example in a direct port from a legacy .NET application) are likely to require the permissions of full trust in order to function.

      Microsoft Azure security: AntiXSS

      Many of the issues that arise within an application context (and more specifically, a Web application context) occur as a result of malicious input; in other words, user-supplied input is a common avenue of attack unless input is constrained or validated as part of application processing. This isn’t easy to do; it generally takes quite a bit of effort (and training) to get developers of business logic to understand what to filter, why, and how to test that filtering is comprehensive.

      Because of this, Microsoft has made freely available the Web Protection Library (WPL) which provides a canned library of input validation that developers can use to help offset some of these issues. The AntiXSS library within the WPL provides capabilities that developers can integrate to encode user input, thereby reducing the likelihood that an attacker could subvert the input field to negatively influence application behavior.
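The idea of encoding user input before it reaches the page can be illustrated in a few lines. This sketch does not use the Web Protection Library or its AntiXSS API, which are .NET components; it uses Python's standard `html.escape` to show the same general technique, and `render_comment` is a hypothetical helper.

```python
import html

def render_comment(user_input):
    """Embed untrusted text in an HTML fragment after encoding it."""
    # html.escape converts &, < and > to entities, and by default both
    # quote characters as well, so the text cannot open a tag or attribute.
    return "<p>" + html.escape(user_input) + "</p>"

# A typical script-injection attempt ends up as inert text.
page = render_comment('<script>alert("xss")</script>')
```

After encoding, the browser displays the submitted markup as literal characters instead of executing it.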

      Microsoft Azure security: Leveraging diagnostics

      The next best thing to being able to prevent an attack is to have some way to know that it happened. In a traditional on-premise application deployment scenario, security professionals might implement enhanced logging and detection controls to offset application-level security risks. This same strategy can be applied to Azure. Specifically, the diagnostics capability of Azure can be configured to provide additional security relevant information beyond just the instrumentation that might already exist within the application itself. IIS logs, infrastructure logs and other logging can be a way to keep an eye on the application once it’s live without the need for extensive planning, coding, and retesting.
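To make the log-review idea concrete, the sketch below scans web server log lines for failed requests. It assumes the logs are in W3C extended log file format, where a `#Fields:` directive names the space-separated columns; it does not call any Azure diagnostics API, `failed_requests` is a hypothetical helper, and the sample entries are invented for illustration.

```python
def failed_requests(log_lines, min_status=400):
    """Yield (uri, status) for requests whose HTTP status is >= min_status.

    Assumes W3C extended log format: a '#Fields:' directive names the
    space-separated columns, and other '#' lines are directives to skip.
    """
    fields = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        status = int(row["sc-status"])
        if status >= min_status:
            yield row["cs-uri-stem"], status

# Invented sample entries, not real log data.
sample_log = [
    "#Software: Microsoft Internet Information Services 7.5",
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2012-05-14 10:00:01 GET /default.aspx 200",
    "2012-05-14 10:00:02 GET /admin/config.aspx 403",
    "2012-05-14 10:00:03 POST /login.aspx 500",
]
suspicious = list(failed_requests(sample_log))
```

A burst of 403 or 500 responses against a single path is the kind of signal a reviewer might follow up on without touching the application's own code.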

      Obviously, these measures aren’t a comprehensive answer for PaaS security. Ideally, the goal for organizations is a long-term sophisticated lifecycle-focused methodology that includes baking in security to the application through SDLC and process changes. But short-term, when it’s hard to get traction for code changes and there’s pressure to get something out the door in a hurry, these quick steps might help.

      Full disclosure: I’m a paid contributor to SearchCloudSecurity.com’s sister publication, SearchCloudComputing.com.


      “Researcher” posted to the Clean-Cloud blog on 5/12/2012 (missed when published):

      The United States Health Insurance Portability and Accountability Act of 1996 (HIPAA) was intended to protect individuals’ rights to privacy and confidentiality and the security of electronic transfers of personal information.

      1.1.1 Challenges:
      • Physically and logically secure environment
      • Encryption of the data / Mitigation control
      • Auditing, Back-Ups, & Disaster Recovery
      • Access Controls
      • CSP: An ISO/IEC 27001-certificated ISMS will ensure that you are in compliance with the whole range of information-related legislation, including (as applicable) HIPAA, GLBA, SB 1386 and other State breach laws, PIPEDA, FISMA, EU Safe Harbor regulations
      • CSP: SAS 70 is an operational certification to help satisfy HIPAA requirements. SAS 70 checks a lot of things on the HIPAA list.
      1.1.2 Solution / Control Objectives in Cloud
      1.1.3 Example

      Amazon Web Services (AWS) provides a reliable, scalable, and inexpensive computing platform “in the cloud” that can be used to facilitate healthcare customers’ HIPAA-compliant applications.


      “Researcher” described Cloud Computing and ISO/IEC 27001 in a 5/11/2012 post to the Clean-Cloud blog (missed when published):

      ISO/IEC 27001 is an Information Security Management System (ISMS) standard published in October 2005 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC).

      1.1.1 Challenges:

      ISO/IEC 27001 requires that management:

      • Establish Information Security Management System
      • Design and implement information security controls and
      • Adopt monitoring and management process to ensure that the information security controls meet the organization’s information security needs
      • Manage identified risks.
      • The 27001 standard does not mandate specific information security controls, but it provides a checklist of controls that should be considered in the accompanying code of practice, ISO/IEC 27002:2005. This second standard describes a comprehensive set of information security control objectives and a set of generally accepted good practice security controls.
      • CSP & Customer: Establish policies for Data Confidentiality, Data Integrity, Availability, Backup & Archive, ownership, classification, decommissioning, and location awareness.
      • CSP & Customer: Use of Cloud monitoring and management products
      • CSP: ISO 27001 auditors may request a SAS 70 Type II report in order to complete their evaluations for customers
      1.1.2 Solution / Control Objectives in Cloud
      1.1.3 Example

      AWS has achieved ISO 27001 certification of our Information Security Management System (ISMS) covering AWS infrastructure, data centers, and services including Amazon EC2, Amazon S3 and Amazon VPC.

      “Researcher” also published Cloud Computing and SAS 70 TYPE-II (5/10/2012)



      Cloud Computing Events

      Sarah Lamb (@MrsActionLamb, @girlygeekdom) reported on 5/14/2012 Scott Guthrie to speak at UK Windows Azure Conference on 6/22/2012 in London, UK:

      For those of you who haven’t yet heard, the Windows Azure conference is coming up, and if you are building or considering building applications for the cloud, then the 22nd June is a date for your diary.

      It’s a big day for Windows Azure as it will be the first full day, multi-track conference supported by Microsoft and London Windows Azure User Group, featuring content for .NET, PHP, Java and node.js developers as well as device support in Windows 8, iOS, Windows Phone and Android.

      Not to be missed is the keynote speaker Scott Guthrie, the Microsoft Corporate Vice President in charge of the development platform for Windows Azure. [Emphasis added.]

      Tickets are free until 20th May so get them whilst you can.

      Register for the conference here: http://azureconference2012.eventbrite.com/


      HBaseCon reported HBase Conference 2012 will take place on 5/22/2012 at the InterContinental hotel in San Francisco CA:

      Real-Time Your Hadoop

      Join us for HBaseCon 2012, the first industry conference for Apache HBase users, contributors, administrators and application developers.

      Network. Share ideas with colleagues and others in the rapidly growing HBase community. See who is speaking ›

      Learn. Attend sessions and lightning talks about what’s new in HBase, how to contribute, best practices on running HBase in production, use cases and applications. View the agenda ›

      Train. Make the most of your week and attend Cloudera training for Apache HBase, in the 2 days following the conference. Sign up ›

      Date & Location

      May 22, 2012

      InterContinental San Francisco Hotel
      888 Howard Street
      San Francisco, CA 94103

      Attend and Receive a Free Ebook

      Courtesy of O’Reilly Media, all attendees will receive a voucher for a free ebook of HBase: The Definitive Guide, by author Lars George.



      Other Cloud Computing Platforms and Services

      Joe Brockmeier (@jzb) reported Google Prices its Cloud SQL Offering, Solidifies Cloud Database Market in a 5/14/2012 post to the ReadWriteCloud:

      The cloud database market continues to solidify as Google puts a price tag on its Cloud SQL offering. With actual charges to begin on June 12th, the move finally gives developers a way to see what they'll be spending on Cloud SQL, but comparing Google's offering to Amazon, Microsoft and others might still be a bit tricky.

      Google's Cloud SQL is MySQL-based and is intended to be used with Google App Engine (GAE). Google's pricing structure is very simple, though not as comprehensive or as expandable as Amazon or others.

      Google has two billing plans: a package plan and a per-use plan. The package plan has four tiers, each of which includes a set amount of RAM, storage and I/O per day. For instance, Google charges $1.46 per day for the D1 tier, which has .5GB of RAM, 1GB of storage and 850,000 I/O requests. The top package (D8) includes 4GB of RAM, 10GB storage and 8 million I/O requests for $11.71 per day.

      The same instances are available on an on-demand basis, starting at $0.10 per hour, with storage and I/O extra.

      The cheapest package from Google, then, runs about $45 a month and the most expensive runs about $357. That doesn't count any overages for I/O or storage.
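The monthly figures follow from the per-day prices quoted above. The short calculation below reproduces them, assuming an average month of about 30.5 days; that month length is an assumption chosen because it matches the article's rounded numbers, not something the article states.

```python
DAYS_PER_MONTH = 30.5  # assumption: an average month length

def monthly_cost(daily_rate, days=DAYS_PER_MONTH):
    """Convert a per-day package price to an approximate monthly figure."""
    return daily_rate * days

d1_monthly = monthly_cost(1.46)   # D1 tier, $1.46 per day
d8_monthly = monthly_cost(11.71)  # top tier, $11.71 per day
print(f"D1: about ${d1_monthly:.0f}/month, top tier: about ${d8_monthly:.0f}/month")
```

With a 30-day month the results come out slightly lower, so the quoted numbers should be read as approximations.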

      Sizing Up Google's Pricing

      Trying to compare Google pricing with Amazon, Azure or databases offered with PaaS services such as Heroku and Engine Yard is tricky, at best. Heroku's database offerings start at $50 per month, but the specs for its database differ considerably from the other providers. For example, Heroku features data clips for developers, and the hstore extension for key/value data storage.

      Amazon's DB instances seem to be a bit more powerful than Google Cloud SQL instances, and Amazon has features that Google Cloud SQL doesn't. For instance, Amazon's Small DB instance has 1.7 GB of RAM and the equivalent of a single CPU. With Cloud SQL, you're also limited to the languages Google App Engine supports, Python and Java.

      With Amazon, developers can choose between 5GB and 1TB of storage (the max for Google is 10GB storage). The Small DB instance runs about $77 a month, if it's on-demand. But, choosing a one-year reserved instance brings that down to about $45 a month. The pricing, then, seems to line up for the "small" instances for Amazon RDS and Google Cloud SQL, but Google has fewer features and what looks to be less compute power.

      But if you're using GAE, then Cloud SQL is the natural choice - so it's nice to see Google finally getting this into developers' hands. If you're using GAE and Cloud SQL, we'd love to hear what you think.

      Google’s pricing doesn’t appear to me to be competitive with SQL Azure.


      Lydia Leong (@cloudpundit) offers the view of a Gartner Research VP in her Amazon CloudFront gets whole site delivery and acceleration of 5/14/2012:

      For months, there has been an abundance of rumors that Amazon was intending to enter the dynamic site acceleration market; it was the logical next step for its CloudFront CDN. Today, Amazon released a set of features oriented towards dynamic content, described in blog posts from Amazon’s Jeff Barr and Werner Vogels.

      When CloudFront introduced custom origins (as opposed to the original CloudFront, which required you to use S3 as the origin), and dropped minimum TTLs down to zero, it effectively edged into the “whole site delivery” feature set that’s become mainstream for the major CDNs.

      With this latest release, whole site delivery is much more of a reality: you can have multiple origins so you can mix static and dynamic content (which are often served from different hostnames; for example, you might have images.mycompany.com serving your static content, but www.mycompany.com serving your dynamic content), and you’ve got pattern-matching rules that let you define what the cache behavior should be for content whose URL matches a particular pattern.

      The “whole site delivery” feature set is important, because it hugely simplifies CDN configuration. Rather than having to go through your site and change its URL references to the CDN (long-time CDN watchers may remember that Akamai in the early days would have customers “Akamaize” their site using a tool that did these URL rewrites), the CDN is smart — it just goes to the origin and pulls things, and it can do so dynamically (so, for instance, you don’t have to explicitly publish to the CDN when you add a new page, image, etc. to your website). It gets you closer to simply being able to repoint the URL of your website to the CDN and having magic happen.

      The dynamic site acceleration features (the actual network optimization features) that are being introduced are much more limited. They basically amount to TCP connection multiplexing, TCP connection persistency/pooling, and TCP window size optimization, much like Cotendo in its very first version. At this current stage, it’s not going to be seriously competing against Akamai’s DSA offering (or CDNetworks’s similar DWA offering), but it might have appeal against EdgeCast’s ADN offering.

      However, I would expect that like everything else that Amazon releases, there will be frequent updates that introduce new features. The acceleration techniques are well known at this point, and Amazon would presumably logically add bidirectional (symmetric POP-to-POP) acceleration as the next big feature, in addition to implementing the common other optimizations (dynamic congestion control, TCP “FastRamp”, etc.).

      What’s important here: CloudFront dynamic acceleration costs the same as static delivery. For US delivery, that starts at about $0.12/GB and goes down to below $0.02/GB for high volumes. That’s easily somewhere between one-half and one-tenth of the going rate for dynamic delivery. The delta is even greater if you look at a dynamic product like Akamai WAA (or its next generation, Terra Alta), where enterprise applications that might do all of a TB of delivery a month typically cost $6000 per app per month — whereas a TB of CloudFront delivery is $120. Akamai is pushing the envelope forward in feature development, and arguably those price points are so divergent that you’re talking about different markets, but low price points also expand a market to where lots of people can decide to do things, because it’s a totally different level of decision — to an enterprise, at that kind of price point, it might as well be free.
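
      The price gap described above can be checked with a short calculation. This is a sketch only: it applies the quoted starting US rate as a flat rate and treats a terabyte as 1,000 GB, which is the assumption that reproduces the $120 figure. The constant and function names are illustrative.

```python
CLOUDFRONT_TOP_RATE = 0.12   # USD per GB, the starting US rate quoted above
AKAMAI_WAA_MONTHLY = 6000.0  # USD per app per month, the typical figure quoted above
GB_PER_TB = 1000             # assumption: decimal terabyte, which matches the $120 figure


def cloudfront_monthly_cost(gigabytes: float, rate_per_gb: float = CLOUDFRONT_TOP_RATE) -> float:
    """Flat-rate estimate; actual CloudFront pricing is tiered by volume and region."""
    return gigabytes * rate_per_gb


cost = cloudfront_monthly_cost(1 * GB_PER_TB)
ratio = AKAMAI_WAA_MONTHLY / cost
print(f"CloudFront: ${cost:.0f} per month, a factor of about {ratio:.0f} below the quoted WAA price")
```

      At these quoted figures the difference for a terabyte a month is roughly a factor of 50.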

      Give CloudFront another year of development, and there’s a high probability that it can become a seriously disruptive force in the dynamic acceleration market. The price points change the game, making it much more likely that companies, especially SaaS providers (many of whom use EC2, and AWS in general), who have been previously reluctant to adopt dynamic acceleration due to the cost, will simply get it as an easy add-on.

      There is, by the way, a tremendous market opportunity out there for a company that delivers value-added services on top of CloudFront — which is to say, the professional services to help customers integrate with it, the ongoing expert technical support on a day to day basis, and a great user portal that provides industry-competitive reporting and analytics. CloudFront has reached the point where enterprises, large mainstream media companies, and other users of Akamai, Limelight, and Level 3 who feel they need ongoing support of complex implementations and a great toolset that helps them intelligently operate those CDN implementations, are genuinely interested in taking a serious look at CloudFront as an alternative, but there’s no company that I know of that provides the services and software that would bridge the gap between CloudFront and a traditional CDN implementation.


      Barb Darrow (@gigabarb) provided a third-party slant on Amazon updates CDN for dynamic content in a 5/14/2012 post to GigaOm’s Structure blog:

      Amazon is updating its CloudFront content delivery network (CDN) to handle dynamic, interactive web content.

      CDNs help web sites ensure that users get the web pages they want faster, typically by caching popular pages closer to likely users. Over the years, CDN providers like market leader Akamai have moved on from static pages (collections of text and photos) to streamed video. Now the battle is all about dynamic or interactive sites, such as online games, that require bursts of traffic flowing back and forth.

      Amazon’s CDN has for some time delivered static and streaming content for business customers but relied on partners including Akamai for much more bandwidth-intensive dynamic content. (Check out CDN Planet for a good overview of the major CDN players.) According to the Amazon Web Services blog, several changes to CloudFront should speed up that delivery.

      For example, CloudFront will now let customers serve content from multiple sources (Amazon’s own S3 storage service, dynamic content from Amazon EC2, and third-party sites) from a single domain name. That, the company said, simplifies implementation.

      By adding more dynamic delivery capabilities to CloudFront, Amazon is starting to encroach more on the turf of its CDN partners, including Akamai.

      In response to another CloudFront update, an Akamai source earlier this year told me that he clearly had to watch what Amazon is doing in CDNs but that to date, CloudFront only dealt with static content, leaving the heavy lifting on dynamic content to Akamai. That is clearly starting to change. Akamai remains the dominant CDN power with more than 1,700 CDN sites on its network, compared to 30 locations for CloudFront, but it’s clear that Amazon is not content to rest on its laurels. …

      Full disclosure: I’m a registered GigaOm analyst.


      Jeff Barr (@jeffbarr) reported Amazon CloudFront - Support for Dynamic Content in a 5/14/2012 post:

      Introduction
      Amazon CloudFront's network of edge locations (currently 30, with more in the works) gives you the ability to distribute static and streaming content to your users at high speed with low latency.

      Today we are introducing a set of features that, taken together, allow you to use CloudFront to serve dynamic, personalized content more quickly.

      What is Dynamic Personalized Content?
      As you know, content on the web is identified by a URL, or Uniform Resource Locator such as http://media.amazonwebservices.com/blog/console_cw_est_charge_service_2.png . A URL like this always identifies a unique piece of content.

      A URL can also contain a query string. This takes the form of a question mark ("?") and additional information that the server can use to personalize the request. Suppose that we had a server at www.example.com that can return information about a particular user by invoking a PHP script that accepts a user name as an argument, with URLs like http://www.example.com/userinfo.php?jeff or http://www.example.com/userinfo.php?tina.

      Up until now, CloudFront did not use the query string as part of the key that it uses to identify the data that it stores in its edge locations.

      We're changing that today, and you can now use CloudFront to speed access to your dynamic data at our current low rates, making your applications faster and more responsive, regardless of where your users are located.
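
      To make the effect of the change concrete, here is a minimal sketch of how a cache key might be derived with and without the query string. It is an illustration only, not CloudFront's actual key format; the function name and the key layout are assumptions.

```python
from urllib.parse import urlsplit


def cache_key(url: str, use_query_string: bool) -> str:
    """Build an illustrative cache key from a URL.

    Assumption: the key is host + path, plus the query string when enabled.
    The real CloudFront key format is not described in the post.
    """
    parts = urlsplit(url)
    key = f"{parts.netloc}{parts.path}"
    if use_query_string and parts.query:
        key += f"?{parts.query}"
    return key


jeff = "http://www.example.com/userinfo.php?jeff"
tina = "http://www.example.com/userinfo.php?tina"

# Query strings ignored: both users map onto the same cached object.
print(cache_key(jeff, False) == cache_key(tina, False))
# Query strings included: each user gets a separate cache entry.
print(cache_key(jeff, True) == cache_key(tina, True))
```

      The first comparison prints True and the second prints False, which is why per-user pages could not be cached safely at the edge before this change.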

      With this change (and the others that I'll tell you about in a minute), Amazon CloudFront will become an even better component of your global applications. We've put together a long list of optimizations that will each increase the performance of your application on their own, but will work even better when you use them in conjunction with other AWS services such as Route 53, Amazon S3, and Amazon EC2.

      Tell Me More
      Ok, so here's what we've done:

      Persistent TCP Connections - Establishing a TCP connection takes some time because each new connection requires a three-way handshake between the server and the client. Amazon CloudFront makes use of persistent connections to each origin for dynamic content. This obviates the connection setup time that would otherwise slow down each request. Reusing these "long-haul" connections back to the server can eliminate hundreds of milliseconds of connection setup time. The connection from the client to the CloudFront edge location is also kept open whenever possible.

      Support for Multiple Origins - You can now reference multiple origins (sources of content) from a single CloudFront distribution. This means that you could, for example, serve images from Amazon S3, dynamic content from EC2, and other content from third-party sites, all from a single domain name. Being able to serve your entire site from a single domain will simplify implementation, allow the use of more relative URLs within the application, and can even get you past some cross-site scripting limitations.

      Support for Query Strings - CloudFront now uses the query string as part of its cache key. This optional feature gives you the ability to cache content at the edge that is specific to a particular user, city (e.g. weather or traffic), and so forth. You can enable query string support for your entire website or for selected portions, as needed.

      Variable Time-To-Live (TTL) - In many cases, dynamic content is either not cacheable or cacheable for a very short period of time, perhaps just a few seconds. In the past, CloudFront's minimum TTL was 60 minutes since all content was considered static. The new minimum TTL value is 0 seconds. If you set the TTL for a particular origin to 0, CloudFront will still cache the content from that origin. It will then make a GET request with an If-Modified-Since header, thereby giving the origin a chance to signal that CloudFront can continue to use the cached content if it hasn't changed at the origin.

      Large TCP Window - We increased the initial size of CloudFront's TCP window to 10 back in February, but we didn't say anything at the time. This enhancement allows more data to be "in flight" across the wire at a given time, without the usual waiting time as the window grows from the older value of 2.

      API and Management Console Support - All of the features listed above are accessible from the CloudFront APIs and the CloudFront tab of the AWS Management Console. You can now use URL patterns to exercise fine-grained control over the caching and delivery rules for different parts of your site.

      Of course, all of CloudFront's existing static content delivery features will continue to work as expected: GET and HEAD requests, default root object, invalidation, private content, access logs, IAM integration, and delivery of objects compressed by the origin.
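
      The zero-TTL behavior described under Variable Time-To-Live relies on standard HTTP conditional requests. The following sketch shows how an origin might answer a GET that carries an If-Modified-Since header. The function name is hypothetical and the logic is simplified (it ignores ETag and other validators); it is not taken from the post.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def origin_response(last_modified: datetime, if_modified_since: Optional[str]) -> int:
    """Return the HTTP status a hypothetical origin would send.

    304 tells the edge it may keep serving its cached copy;
    200 means the origin sends a fresh body.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
        # HTTP dates have one-second resolution, so drop microseconds first.
        unchanged = last_modified.replace(microsecond=0) <= since
    except (TypeError, ValueError):
        return 200  # unparseable or incomparable header: send a full response
    return 304 if unchanged else 200


cached_at = datetime(2012, 5, 14, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(cached_at, usegmt=True)

print(origin_response(cached_at, header))  # object unchanged since it was cached
print(origin_response(datetime(2012, 5, 14, 12, 5, 0, tzinfo=timezone.utc), header))  # object changed
```

      In the unchanged case the origin answers with a 304 and no body, so even with a TTL of 0 the edge avoids transferring the object again.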

      Working Together
      Let's take a look at the ways that various AWS services work together to make delivery of static and dynamic content as fast, reliable, and efficient as possible (click on the diagram at right for an even better illustration):

      • From Application / Client to CloudFront - CloudFront’s request routing technology ensures that each client is connected to the nearest edge location as determined by latency measurements that CloudFront continuously takes from internet users around the world. Route 53 may be optionally used as a DNS service to create a CNAME from your custom domain name to your CloudFront distribution. Persistent connections expedite data transfer.
      • Within the CloudFront Edge Locations - Multiple levels of caching at each edge location speed access to the most frequently viewed content and reduce the need to go to your origin servers for cacheable content.
      • From Edge Location to Origin - The nature of dynamic content requires repeated back and forth calls to the origin server. CloudFront edge locations collapse multiple concurrent requests for the same object into a single request. They also maintain persistent connections to the origins (with the large window size). Connections to other parts of AWS are made over high-quality networks that are monitored by Amazon for both availability and performance. This monitoring has the beneficial side effect of keeping error rates low and window sizes high.

      Cache Behaviors
      In order to give you full control over query string support, TTL values, and origins, you can now associate a set of Cache Behaviors with each of your CloudFront distributions. Each behavior includes the following elements:

      • Path Pattern - A pattern (e.g. "*.jpg") that identifies the content subject to this behavior.
      • Origin Identifier - The identifier for the origin where CloudFront should forward user requests that match this path pattern.
      • Query String - A flag to enable support for query string processing for URLs that match the path pattern.
      • Trusted Signers - Information to enable other AWS accounts to create signed URLs for this URL path pattern.
      • Protocol Policy - Either allow-all or https-only, also applied only to this path pattern.
      • MinTTL - The minimum time-to-live for content subject to this behavior.
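
      The elements listed above can be modeled as a small lookup table. The sketch below is illustrative only: the class and function names, the first-match ordering, and the use of Python's fnmatch wildcards are assumptions rather than CloudFront's documented matching rules, and Trusted Signers and Protocol Policy are left out for brevity.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Sequence


@dataclass
class CacheBehavior:
    path_pattern: str           # e.g. "*.jpg"
    origin_id: str              # which origin receives matching requests
    forward_query_string: bool  # whether the query string is part of the cache key
    min_ttl: int                # minimum time-to-live in seconds


def select_behavior(path: str, behaviors: Sequence[CacheBehavior],
                    default: CacheBehavior) -> CacheBehavior:
    """Return the first behavior whose pattern matches the path, else the default."""
    for behavior in behaviors:
        if fnmatchcase(path, behavior.path_pattern):
            return behavior
    return default


# Hypothetical configuration: static images from S3, PHP pages from EC2.
behaviors = [
    CacheBehavior("*.jpg", "s3-static", False, 3600),
    CacheBehavior("*.php", "ec2-dynamic", True, 0),
]
default = CacheBehavior("*", "ec2-dynamic", False, 0)

for path in ("/images/logo.jpg", "/userinfo.php", "/index.html"):
    chosen = select_behavior(path, behaviors, default)
    print(path, "->", chosen.origin_id, chosen.min_ttl)
```

      Keeping the default behavior separate from the ordered list mirrors the idea of a default origin that handles anything no pattern claims.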

      Tool Support
      Andy from CloudBerry Lab sent me a note to let me know that they have added dynamic content support to the newest free version of the CloudBerry Explorer for Amazon S3. In Andy's words:

      I'd like to let you know that CloudBerry Explorer is ready to support new CloudFront features by the time of release. We have added the ability to manage multiple origins for a distribution, configure cache behavior for each origin based on URL path patterns and configure CloudFront to include query string parameters.

      He also sent some screen shots to show us how it works. The first step is to specify the Origins and CNAMEs associated with the distribution:

      The next step is to specify the Path Patterns:

      With the Origins and Path Patterns established, the final step is to configure the Path Patterns:

      And Here You Go
      Together with CloudFront's cost-effectiveness (no minimum commits or long-term contracts), these features add up to a content distribution system that is fast, powerful, and easy to use.

      So, what do you think? What kinds of applications can you build with these powerful new features?


      Werner Vogels (@werner) described Dynamic Content Support in Amazon CloudFront on 5/13/2012:

      In the past three and a half years, Amazon CloudFront has changed the content delivery landscape. It has demonstrated that a CDN does not have to be complex to use with expensive contracts, minimum commits, or upfront fees, such that you are forcibly locked into a single vendor for a long time. CloudFront is simple, fast and reliable with the usual pay-as-you-go model. With just one click you can enable content to be distributed to the customer with low latency and high reliability.

      Today Amazon CloudFront has taken another major step forward in ease of use. It now supports delivery of entire websites containing both static objects and dynamic content. With these features CloudFront makes it as simple as possible for customers to use CloudFront to speed up delivery of their entire dynamic website running in Amazon EC2/ELB (or third-party origins), without needing to worry about which URLs should point to CloudFront and which ones should go directly to the origin.

      Dynamic Content Support

      Recall that last month the CloudFront team announced lowering the minTTL customers can set on their objects, down to as low as 0 seconds to support delivery of dynamic content. In addition to the TTLs, customers also need some other features to deliver dynamic websites through CloudFront. The first set of features that CloudFront is launching today includes:

      Multiple Origin Servers: the ability to specify multiple origin servers, including a default origin, for a CloudFront download distribution. This is useful when customers want to use different origin servers for different types of content. For example, an Amazon S3 bucket can be used as the origin for static objects and an Amazon EC2 instance as the origin for dynamic content, all fronted by the same CloudFront distribution domain name. Of course non-AWS origins are also permitted.

      Query String based Caching: the ability to include query string parameters as part of the object's cache key. Customers will have a switch to turn query strings 'on' or 'off'. When turned off, CloudFront's behavior will be the same as today - i.e., CloudFront will not pass the query string to the origin server nor include query string parameters as a part of the object's cache key. And when query strings are turned on, CloudFront will pass the full URL (including the query string) to the origin server and also use the full URL to uniquely identify an object in the cache.

      URL based configuration: the ability to configure cache behaviors based on URL path patterns. Each URL path pattern will include a set of cache behaviors associated with it. These cache behaviors include the target origin, a switch for query strings to be on/off, a list of trusted signers for private content, the viewer protocol policy, and the minTTL that CloudFront should apply for that URL path pattern. See the graphic at the end of this post for an example configuration.

      More new features

      In addition to these features, there are other things the CloudFront team has achieved to speed up delivery of content, and all customers will get these benefits by default without additional configuration. These performance optimizations are available for all types of content (static and dynamic) delivered via CloudFront. Specifically:

      Optimal TCP Windows. The TCP initcwnd has been increased for all CloudFront hosts to maximize the available bandwidth between the edge and the viewer. This is in addition to the existing optimizations of routing viewers to the edge location with lowest latency for that user, and also persistent connections with the clients.

      Persistent Connection to Origins. Connections are improved from CloudFront edge locations to the origins by maintaining long-lived persistent connections. This helps by reducing the connection set-up time from the edge to the origin for each new viewer. When the viewer is far away from the origin, this is even more helpful in minimizing total latency between the viewer and the origin.

      Selecting the best AWS region for Origin Fetch. When customers run their origins in AWS, we expect that our network paths from each CloudFront edge to the various AWS Regions will perform better with less packet loss given that we monitor and optimize these network paths for availability and performance. In addition, the architecture diagram shows an optional configuration in which developers use Route 53’s LBR (Latency Based Routing) to run their origin servers in different AWS Regions. Each CloudFront edge location will then go to the “best” AWS Region for the origin fetch. Route 53 already understands very well which CloudFront host is in which edge location (this is an integration we’ve built between the two services). This helps improve performance even further.
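
      The benefit of the larger initial congestion window mentioned under Optimal TCP Windows can be estimated with an idealized slow-start model. Jeff Barr's post gives the old and new initial values as 2 and 10; everything else in this sketch (the segment size, loss-free doubling each round trip, and the function name) is an assumption made for illustration.

```python
MSS = 1460  # bytes per segment; assumption: a typical Ethernet-path value


def round_trips(payload_bytes: int, initcwnd: int, mss: int = MSS) -> int:
    """Count round trips needed to deliver a payload under idealized slow start.

    Assumptions: no packet loss, the window doubles every round trip, and the
    receiver's advertised window never limits the sender.
    """
    segments = (payload_bytes + mss - 1) // mss  # ceiling division
    window, sent, trips = initcwnd, 0, 0
    while sent < segments:
        sent += window
        window *= 2
        trips += 1
    return trips


for size in (14_600, 60_000):
    print(size, "bytes:", round_trips(size, 2), "round trips at initcwnd 2,",
          round_trips(size, 10), "at initcwnd 10")
```

      In this model a 60,000-byte response needs five round trips with an initial window of 2 but three with an initial window of 10, so the saving matters most on high-latency paths.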

      Amazon CloudFront is expanding its functionality and feature set at an incredible pace. I am particularly excited about these features that help customers deliver both static and dynamic content through one distribution. CloudFront stays true to its mission in making a Content Delivery Network dead simple to use, and now they also do this for dynamic content.

      For more details, see the CloudFront detail page and the posting on the AWS developer blog.


      Joe Panettieri (@joepanettieri) asked Red Hat OpenShift PaaS: Will Cloud Developers Climb Aboard? in a 5/13/2012 post to the TalkinCloud blog:

      When Red Hat (NYSE: RHT) recently announced its long-term strategy for OpenShift, I began to think about potential implications for cloud-focused application developers and emerging cloud consultants. Already, cloud developers are seeking to understand cloud platforms like OpenStack, CloudStack, Microsoft Windows Azure and VMware Cloud Foundry. Amid all that noise, can Red Hat attract developers to OpenShift? And equally important: Can cloud consultants explain OpenShift and its alternatives to business customers?

      So far, Red Hat is positioning OpenShift, a platform as a service (PaaS), mostly for enterprise customers and developers. There isn’t much — if any — chatter about OpenShift for SMB (small and midsize business) use.

      Red Hat unveiled OpenShift in May 2011. By April 2012, Red Hat open sourced OpenShift through a project called OpenShift Origin. And in May 2012, Red Hat offered updates regarding the OpenShift road map. That roadmap explains how OpenShift is built atop Red Hat’s core technologies. According to Red Hat:

      “Combining the core enterprise technologies that power OpenShift PaaS– including Red Hat Enterprise Linux, Red Hat Storage, JBoss Enterprise Middleware and OpenShift’s integrated programming languages, frameworks and developer tools – Red Hat plans to deliver the OpenShift cloud application platform available as a PaaS for enterprises in an open and hybrid cloud.”

      Potential OpenShift Opportunities

      No doubt, Red Hat will try to convince existing Red Hat Enterprise Linux ISVs (independent software vendors), JBoss integrators and other channel partners to embrace OpenShift. And it sounds like there will be three ways for enterprise customers to use OpenShift, including:

      1. As a service. A fee-based version of OpenShift.RedHat.com is expected to launch in late 2012.
      2. As a private PaaS offering. Where enterprises run OpenShift on their own.
      3. On a third-party cloud or via a third-party virtualization provider, though it’s unclear to me at this time which third parties might be options for Red Hat customers.

      Rival PaaS Offerings

      In some ways, OpenShift sounds most similar to VMware’s Cloud Foundry, another emerging PaaS platform. A safe guess:

      • Red Hat will likely assert that OpenShift coupled with Red Hat Enterprise Virtualization (RHEV) will offer a lower-cost, more open approach forward vs. VMware Cloud Foundry and vSphere virtualization.
      • VMware (NYSE: VMW), on the flip side, will likely assert that its virtualization software remains the most scalable, most reliable, easiest-to-manage foundation for cloud services.

      Meanwhile, Microsoft (NASDAQ: MSFT) continues to march forward with its own PaaS play — Windows Azure. Microsoft hasn’t said much about Windows Azure’s revenue base so far, and there have been rumors that Microsoft may rebrand Windows Azure amid a slow market start (personally, I doubt the rumors). [Emphasis added.]

      In recent weeks, a growing list of ISVs (independent software vendors) have launched applications in the Windows Azure cloud. One example is CA Technologies’ ARCserve, a backup and recovery software platform that started out as an on-premises solution. But in some cases, Microsoft is paying third-party ISVs to support Windows Azure, Talkin’ Cloud has confirmed with multiple sources. That could be a sign that Microsoft is struggling to make Azure a mainstream success.

      Talkin’ Cloud will seek an update during Microsoft Worldwide Partner Conference 2012 (WPC12, July 8-12, Toronto).

      PaaS vs. IaaS

      Elsewhere, some folks are comparing OpenShift, Cloud Foundry and Windows Azure to OpenStack and CloudStack. But that’s not exactly an apples-to-apples comparison.

      • OpenStack (originally promoted by Rackspace) and CloudStack (originally promoted by Citrix) are IaaS. Here, cloud providers typically offer virtual machines, raw block storage, firewalls and other basic network infrastructure.
      • OpenShift, Cloud Foundry and Windows Azure are PaaS. Here, cloud providers typically offer a solution stack (operating systems, databases and web servers) to application developers.

      Time to Educate Your Customers

      Cloud developers certainly understand all the jargon above. But it’s a safe bet most CIOs (chief information officers) and corporate executives don’t know the differences between OpenShift, Cloud Foundry and other emerging PaaS options.

      That’s where cloud consultants and cloud integrators enter the picture. And so far, I don’t think Red Hat and its rivals have done enough to educate consultants and integrators about the cloud opportunities ahead.

      by Roger Jennings (--rj) (noreply@blogger.com) at May 15, 2012 09:59 AM

      Microsoft Codename “Data Transfer” and “Data Hub” Previews Don’t Appear Ready for BigData

      Or even MediumData, for that matter:

      Neither the Codename “Data Transfer” utility nor Codename “Data Hub” application CTPs would load 500,000 rows of a simple Excel worksheet saved as a 17-MB *.csv file to an SQL Azure table.

      The “Data Transfer” utility’s Choose a File to Transfer page states: “We support uploading of files smaller than 200 MB, right now,” but neither preview publishes a row count limit that I can find. “Data Hub” uses “Data Transfer” to upload data, so the maximum file size would apply to it, too.

      Both Windows Azure SaaS offerings handled a 100-row subset with aplomb, so the issue appears to be row count, not data structure.
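
      A small row-limited test file like the 100-row subset can be cut from a larger *.csv file with a few lines of code. The post does not say how the subset was produced, so the following is only one possible approach; the function name and the sample column names are hypothetical.

```python
import csv
import io
from typing import TextIO


def write_subset(source: TextIO, target: TextIO, max_rows: int) -> int:
    """Copy the header row plus at most max_rows data rows; return the rows written."""
    reader = csv.reader(source)
    writer = csv.writer(target, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return 0  # empty input: nothing to copy
    writer.writerow(header)
    written = 0
    for row in reader:
        if written >= max_rows:
            break
        writer.writerow(row)
        written += 1
    return written


# In-memory demonstration with made-up columns; real files would be opened
# with open(path, newline="") for reading and open(path, "w", newline="") for writing.
sample = io.StringIO("Year,Month,Carrier\n2012,1,AA\n2012,1,DL\n2012,1,UA\n")
subset = io.StringIO()
print(write_subset(sample, subset, 2), "data rows copied")
```

      Producing subsets at several sizes this way is one route to bracketing the row count at which uploads begin to fail.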


      Update 5/15/2012 7:45 AM PDT to the 4/24/2012 update below: A member of the Microsoft Data Hub/Transfer team advised that the known erroneous row count and random upload failure issues have been fixed. I will retest uploads later this week and report my results here and to the team.

      •• Update 5/5/2012 8:45 AM PDT: Max Uritsky (@max_data), Group Program Manager for Microsoft Azure Marketplace, responded to my How Can I Tell if a *.csv Upload Is Successful? thread in the Windows Azure Marketplace Forum, which complained about *.csv upload failures with large files:

      .csv upload in DataMarket is preview only. We will disable it in a mean time in order to reduce confusion. …

      I found no indication that *.csv file uploads were “preview only” during my tests. IMO, eliminating *.csv uploads is a bad idea because it will discourage ordinary contributors from participating in the DataMarket with freely downloadable content. In this case, ordinary contributors would need to provide their own SQL Azure Web database at $9.95 per month. Other remaining content-provision options, such as Web services and OData feeds, aren’t practical for such contributors.

      I assume that Windows Azure DataMarket is using Codename “Data Hub”/”Data Transfer” to upload these files. If so, fixing the problem with Codename “Data Hub” should solve the problem with uploading *.csv files to DataMarket.

      Disabling a feature rather than fixing it isn’t my idea of a good policy.

      Note: The Codename “Data Transfer” team implemented “My Great Windows Azure Idea” to Enable Append or Replace Option for SQL Azure Uploads of 11/20/2011 on 2/6/2012.

      Links to related posts:


      • Update 5/4/2012 10:30 AM PDT: See details about the forcible disconnect from Fiddler2 at the end of this post. Also added five more On_Time_Performance_2011_MM.csv files to my SkyDrive account to complete the series for the year 2011.


      Update 5/2/2012 3:00 PM PDT: I was unable to upload additional ~500,000-row monthly On_Time_Performance_YYYY_MM.csv files to my free US Air Carrier Flight Delays, Monthly dataset on the public Windows Azure Marketplace DataMarket today. Forced disconnects similar to those I reported for Codename “Data Hub” occurred. Here’s a screen capture of the free public offer:

      image

      To subscribe to the data set, go to the Windows Azure Marketplace DataMarket landing page, create an account if you don’t have one, log in, and type OakLeaf in the Search the Marketplace text box to display the data and app offers:

      image

      Click the US Air Carrier Flight Delays, Monthly link to open the Offer page, and click the Sign Up button to open the eponymous page:

      image

      Mark the I have read and agree to … check box and click the Sign Up button to open the Thank You page:

      image

      This process adds a link to your Data list:

      image


      Update 4/24/2012 8:15 AM PDT: A member of the Microsoft Data Hub/Transfer team advised that the erroneous row count and random upload failure problems I reported for Codename “Data Transfer” were known issues and the team was working on them. I was unable to upload the ~500,000-row files with Codename “Data Hub”; see the added “Results with Codename “Data Hub” Uploads” section at the end of the post.


      Update 4/23/2012 10:00 AM PDT: Two members of the Microsoft Data Hub/Transfer team reported that they could upload the large test file successfully. Added “Computer/Internet Connection Details” section below. Completed tests to determine maximum file size I can upload. The My Data page showed anomalous results but only the 200k row test actually failed on 4/23. See the Subsequent Events section.


      Background

      The Creating the Azure Blob Source Data section of my Using Data from Windows Azure Blobs with Apache Hadoop on Windows Azure CTP post of 4/6/2012 described the data set I wanted to distribute via a publicly accessible, free Windows Azure DataMarket dataset. The only differences between it and the tab-delimited *.txt files uploaded to blobs that served as the data source for an Apache Hive table were:

      • Inclusion of column names in the first row
      • Addition of a formatted date field (Hive tables don’t have a native date or datetime datatype)
      • Field delimiter character (comma instead of tab)
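
      The three differences listed above amount to a simple file conversion. The sketch below shows one way it could be done; the column names, the positions of the year, month and day fields, the ISO date format and the FlightDate column name are all hypothetical, because the post does not describe the file layout or how the conversion was performed.

```python
import csv
import io
from typing import Sequence, TextIO


def tsv_to_csv(source: TextIO, target: TextIO, column_names: Sequence[str],
               year_idx: int, month_idx: int, day_idx: int) -> None:
    """Convert headerless tab-delimited rows to CSV with a header and a date column."""
    reader = csv.reader(source, delimiter="\t")
    writer = csv.writer(target, lineterminator="\n")
    # Difference 1: column names in the first row (plus the added date column).
    writer.writerow(list(column_names) + ["FlightDate"])
    for row in reader:
        # Difference 2: a formatted date built from separate numeric fields.
        date = f"{int(row[year_idx]):04d}-{int(row[month_idx]):02d}-{int(row[day_idx]):02d}"
        # Difference 3: csv.writer emits commas instead of tabs.
        writer.writerow(row + [date])


tab_file = io.StringIO("2012\t1\t15\tAA\n2012\t1\t16\tDL\n")
csv_file = io.StringIO()
tsv_to_csv(tab_file, csv_file, ["Year", "Month", "DayofMonth", "Carrier"], 0, 1, 2)
print(csv_file.getvalue())
```

      Using the csv module rather than a plain string replace keeps any field that itself contains a comma correctly quoted.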

      Following is a screen capture of the first 20 data rows of the 500,000-row On_Time_Performance_2012_1.csv table:

      image

      You can download sample On_Time_Performance_YYYY_MM.csv files from the OnTimePerformanceCSV folder of my Windows Live SkyDrive account. On_Time_Performance_2012_0.csv is the 100-row sample file described in the preceding section; On_Time_Performance_2012_1.csv has 486,133 data rows.

      Tab-delimited sample On_Time_Performance_YYYY_MM.txt files (without the first row of column names and formatted date) for use in creating blobs to serve as the data source for Hive databases are available from my Flight Data Files for Hadoop on Azure SkyDrive folder.

      Provision of the files through a private Azure DataMarket service was intended to supplement the SkyDrive downloads.

      Computer/Internet Connection Details:

      Intel 64-bit DQ45CB motherboard with Core 2 Quad CPU Q9950 2.83 GHz, 8 GB RAM, 750 GB RAID 1 discs, Windows 7 Premium SP1, IE 9.0.8112.16421.

      AT&T commercial DSL copper connection, Cayman router, 2.60 Mbps download, 0.42 Mbps upload after router reboot, 100-Mbps wired connection from Windows 2003 Server R&RA NAT.


      Uploading with Microsoft Codename “Data Hub”

      Codename “Data Hub” provides testers with up to four free SQL Azure 1-GB Web databases, so I created a connection to a new On_Time_Performance database:

      image

      I then specified the ~500,000-row On_Time_Performance_2012_1.csv file for January 2012 as the data source and clicked Upload:

      image

      The site provided no indication of any activity, although my DSL router indicated data was being uploaded. After a few minutes, the server disconnected. Reloading the page showed no change in status.

      I then tried uploading the 100-row On_Time_Performance_2012_0.csv, which opened the following page after about 10 seconds:

      image

      I accepted the suggested data types and clicked Submit, which added the data to the table.


      Uploading with Microsoft Codename “Data Transfer”

      I created a new database in an existing OakLeaf SQL Azure instance because “Data Transfer” doesn’t provide free 1-GB Web databases. I repeated the above process with Codename “Data Transfer” but encountered a bug which prevented use of the # (and presumably other symbols) in the existing database access password:

      image

      Update 4/23/2012: A member of Microsoft’s Data Transfer team was able to reproduce the # symbol problem.

      Selecting On_Time_Performance_2012_1.csv to upload by clicking Analyze caused the app to hang in the Loading … condition:

      image

      Canceling the process and selecting 100-row On_Time_Performance_2012_0.csv resulted in the expected Update the Table Settings page appearing in about 10 seconds:

      image 

      Clicking Save resulted in a Submit Succeeded message.


      Conclusion

      Neither Codename “Data Hub” nor Codename “Data Transfer” appears to be ready for prime time. Hopefully, a fast refresh will solve the problem because users’ Codename “Data Hub” preview invitations are valid only for three weeks.


      Subsequent Events

      Members of the Microsoft Data Transfer/Data Hub team weren’t able to reproduce my problem on 4/22 and 4/23/2012. They could process the 486,133-row On_Time_Performance_2012_1.csv file without difficulty. To determine at what file size uploading problems occurred for me, I created files of 1,000, 10,000, 100,000, 150,000, and 200,000 data rows from On_Time_Performance_2012_1.csv. I’ve uploaded these files to the public OnTimePerformanceCSV folder of my Windows Live SkyDrive account.
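
      The post doesn't say how the smaller test files were produced. As a minimal sketch of one way to do it, assuming each test file keeps the column-name row plus the first N data rows of the source file (the function name and the output file names are hypothetical):

```python
import csv
import itertools


def write_first_rows(source_path, target_path, row_count):
    """Copy the header row plus the first row_count data rows to target_path.

    Returns the number of data rows actually written.
    """
    with open(source_path, newline="") as src, \
            open(target_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return 0  # empty source file, nothing to copy
        writer.writerow(header)
        written = 0
        # islice stops early if the source has fewer rows than requested.
        for row in itertools.islice(reader, row_count):
            writer.writerow(row)
            written += 1
        return written
```

      Calling it once per size (1,000, 10,000, 100,000, 150,000 and 200,000) against On_Time_Performance_2012_1.csv would yield the five test files described above.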

      Results with Codename “Data Transfer” Uploads

      All files appeared to upload on Monday morning, 4/23/2012, but My Data showed incorrect Last Job Status data for all but the 10,000-row set. I used Codename “Data Transfer” instead of “Data Hub” to obtain Job Status data. Data below was refreshed about 15 minutes after completion of the 2012_1.csv file; I failed to save the 1,000-row set:

      image

      The 100k file created 100,000 rows, as expected, 200k added no rows to the table, and a rerun of 2012_1 created the expected 486,133 rows:

      image

      Microsoft’s South Central US (San Antonio) data center hosts the e3895m7bbt database. It’s possible that problems affecting Windows Compute there on 4/19 and 4/20 (see my Flurry of Outages on 4/19/2012 for my Windows Azure Tables Demo in the South Central US Data Center post) spilled over to SQL Azure on 4/22/2012, but that’s unlikely. However, the unexpected results with the 200k table and anomalous Last Job Status data indicate an underlying problem. I’ll update this post as I obtain more information on the problem from Microsoft.

      Results with Codename “Data Hub” Uploads

      I was able to upload all test files (100, 1,000, 10,000, 100,000, 150,000 and 200,000 rows) but unable to upload the On_Time_Performance_2012_1.csv file to one of the four free databases with Codename “Data Hub” after three tries. The service forcibly disconnects after data upload completes, so data doesn’t transfer from the blob to the database table.

      So I used the data source I created with Codename “Data Transfer” as an external data source to publish the data. None of the data fields were indexed, which caused the following error to appear (in bold red type) in the “My Offerings” page’s Status/Review section:

      All queryable columns must be indexed: Not all queryable columns in table "On_Time_Performance_2012_1" are indexed. The columns that are not indexed are: "ArrDelayMinutes", "Carrier", "DayofMonth", "DepDelayMinutes", "Dest", "FlightDate", "Month", "Origin", "Year".

      Codename “Data Transfer” doesn’t offer an option to index specific columns so I added indexes on all fields except RowId, DayofMonth, Month and Year with SQL Server Management Studio and cleared the Queryable checkboxes for these fields on the My Offerings - Data Source page.

      Here’s Data Explorer’s Table View of part of the first few rows:

      image

      Check the Codename “Data Transfer” feedback page for my improvement suggestions:

      “Data Hub” doesn’t appear to have its own feedback page.


      • Update 5/4/2012: Fiddler2 returns the following message when the Windows Azure Marketplace DataMarket forcibly closes the connection:

      [Fiddler] ResendRequest() failed: Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host. < An existing connection was forcibly closed by the remote host

      Here’s Fiddler’s Tunnel to HTTPS request:

      CONNECT publish.marketplace.windowsazure.com:443 HTTP/1.0
      User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; BOIE9;ENUS)
      Host: publish.marketplace.windowsazure.com:443
      Content-Length: 0
      Connection: Keep-Alive
      Pragma: no-cache

      A SSLv3-compatible ClientHello handshake was found. Fiddler extracted the parameters below.

      Major Version: 3
      Minor Version: 1
      Random: 4F A4 0D C2 93 73 E8 BF B8 1B 72 0C F9 18 9F 10 20 DB DC 69 CC 10 EA 23 03 52 EB D7 CC DE 9B A5
      SessionID: 39 40 00 00 43 4D 4A 4A C8 09 67 51 4A D9 C1 0C 36 7E 2D CB 12 DB B0 F5 49 03 81 57 86 B2 4D F9
      Ciphers:
          [002F]    TLS_RSA_AES_128_SHA
          [0035]    TLS_RSA_AES_256_SHA
          [0005]    SSL_RSA_WITH_RC4_128_SHA
          [000A]    SSL_RSA_WITH_3DES_EDE_SHA
          [C013]    TLS1_CK_ECDHE_RSA_WITH_AES_128_CBC_SHA
          [C014]    TLS1_CK_ECDHE_RSA_WITH_AES_256_CBC_SHA
          [C009]    TLS1_CK_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
          [C00A]    TLS1_CK_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
          [0032]    TLS_DHE_DSS_WITH_AES_128_SHA
          [0038]    TLS_DHE_DSS_WITH_AES_256_SHA
          [0013]    SSL_DHE_DSS_WITH_3DES_EDE_SHA
          [0004]    SSL_RSA_WITH_RC4_128_MD5

      Compression:
          [00]    NO_COMPRESSION

      Extensions:
          renegotiation_info    00
          server_name    publish.marketplace.windowsazure.com
          status_request    01 00 00 00 00
          elliptic_curves    00 04 00 17 00 18
          ec_point_formats    01 00

      and the HTTPS response:

      HTTP/1.0 200 Connection Established
      FiddlerGateway: Direct
      StartTime: 10:11:30.902
      Connection: close

      Encrypted HTTPS traffic flows through this CONNECT tunnel. HTTPS Decryption is enabled in Fiddler, so decrypted sessions running in this tunnel will be shown in the Web Sessions list.

      Secure Protocol: Tls
      Cipher: Aes128 128bits
      Hash Algorithm: Sha1 160bits
      Key Exchange: RsaKeyX 2048bits

      == Server Certificate ==========
      [Subject]
        CN=publish.marketplace.windowsazure.com

      [Issuer]
        CN=Microsoft Secure Server Authority, DC=redmond, DC=corp, DC=microsoft, DC=com

      [Serial Number]
        7EDE070F0008000251F2

      [Not Before]
        11/15/2011 2:43:11 PM

      [Not After]
        11/14/2013 2:43:11 PM

      [Thumbprint]
        C7FC0219C14C8D274B6630BC1DEE0C3AFF757602

      The ResendRequest()’s HTTP GET request is as follows:

      GET https://publish.marketplace.windowsazure.com/favicon.ico HTTP/1.1
      Accept: */*
      Accept-Encoding: gzip, deflate
      User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; BOIE9;ENUS)
      Host: publish.marketplace.windowsazure.com
      Connection: Keep-Alive
      Cookie: l=en-US; _vis_opt_s=9%7C; _vis_opt_exp_79_combi=2; _vis_opt_exp_82_combi=2; _vis_opt_exp_90_combi=1%2C1; _vis_opt_exp_89_combi=5; _vis_opt_exp_107_combi=2; _vis_opt_exp_107_goal_1=1; _vis_opt_exp_108_combi=2; _vis_opt_exp_108_goal_1=1; _vis_opt_exp_114_combi=1; FedAuth=

      The authentication token is elided for brevity. Here’s the response without the document:

      HTTP/1.1 404 Not Found
      Content-Type: text/html
      Server: Microsoft-IIS/7.5
      X-Powered-By: ASP.NET
      Date: Fri, 04 May 2012 17:11:31 GMT
      Content-Length: 1245


      by Roger Jennings (--rj) (noreply@blogger.com) at May 15, 2012 08:41 AM

      May 14, 2012

      Cloud Musings (Kevin L Jackson)


      FedRAMP PMO Releases First Set of 3PAOs

      Late today the FedRAMP Program Management Office released the first list of certified Third Party Assessment Organizations (3PAOs). These companies are accredited to perform initial and periodic assessment of cloud service provider (CSP) systems per FedRAMP requirements, provide evidence of compliance, and play an on-going role in ensuring CSPs meet requirements. FedRAMP provisional authorizations must include an assessment by an accredited 3PAO to ensure a consistent assessment process. The initial set of 3PAOs announced today follows (see http://www.gsa.gov/portal/content/131991):

      Organization | POC Name | POC Email
      COACT, Inc. | Brian Pleffner | bpleffner@coact.com
      Department of Transportation (DOT) Enterprise Service Center (ESC) | Douglas Holland | doug.holland@faa.gov
      Dynamics Research Corporation (DRC) | Preston Gale | pgale@drc.com
      J.D. Biggs and Associates, Inc. | James Biggs | james@jdbiggs.com
      Knowledge Consulting Group, Inc. | Sherrie Nutzman | sherrie.nutzman@knowledgecg.com
      Logyx LLC | Robert Dumais | rdumais@logyx.com
      Lunarline, Inc. | Waylon Krush | waylon.krush@lunarline.com
      SRA International, Inc. | William Bell | will_bell@sra.com
      Veris Group, LLC | Douglas Greise | dgreise@verisgroup.com

      In becoming a 3PAO, these companies successfully completed a NIST-coordinated conformity assessment process. This conformity assessment process qualifies 3PAOs according to two requirements:
      • Independence and quality management in accordance with ISO standards
      • Technical competence through FISMA knowledge testing





      by noreply@blogger.com (Kevin Jackson) at May 14, 2012 09:41 PM

      Cloud Developer Tips (Shlomo Swidler)


      Poking Holes in CloudFront-Based Sites for Dynamic Content

      As of February 2011 AWS S3 has been able to serve static websites, giving you superior availability for unchanging (or seldom-changing) content. But most websites today are not static; dynamic elements drive essential features such as personalized pages, targeted advertisements, and shopping carts. Today’s release from AWS CloudFront: Support for Dynamic Content alleviates some of the challenge of running dynamic websites. You can now configure a custom set of URL patterns to always be passed through to the origin server. This allows you to “poke holes” in the CDN cache for providing dynamic content.

      Some web sites, such as this one, appear to be static but are driven by dynamic code. WordPress renders each page on every request. Though excellent tools exist to provide caching for WordPress, these tools still require your web server to process WordPress’s PHP scripts. Heavy traffic or poor hosting choices can still overwhelm your web server.

      Poking Holes

       

      It’s relatively easy to configure your entire domain to be served from CloudFront. What do you need to think about when you poke holes in a CloudFront distribution? Here are two important items: admin pages and form actions.

      Admin pages

      The last thing you want is for your site’s control panel to be statically served. You need an accurate picture of the current situation in order to manage your site. In WordPress, this includes everything in the /wp-admin/* path as well as the /wp-login.php page.

      Form actions

      Your site most likely does something with the information people submit in forms – search with it, store it, or otherwise process it. If not, why collect it? In order to process the submitted information you need to handle it dynamically in your web application, and that means the submit action can’t lead to a static page. Make sure your form submission actions – such as search and feedback links – pass through to the webserver directly.
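
      As a rough illustration of the two kinds of holes, here is a minimal Python sketch that decides whether a request path should pass through to the origin. The admin and login paths come from the text above; the /search* and /feedback* patterns are hypothetical placeholders for a site's form actions, and Python's fnmatch wildcards only approximate how CloudFront evaluates its own path patterns:

```python
from fnmatch import fnmatchcase

# Hypothetical pass-through patterns for a WordPress site. The admin and
# login paths come from the article; /search* and /feedback* stand in for
# whatever form actions a given site actually uses.
PASS_THROUGH_PATTERNS = [
    "/wp-admin/*",
    "/wp-login.php",
    "/search*",
    "/feedback*",
]


def goes_to_origin(path):
    """Return True if a request path should bypass the CDN cache."""
    return any(fnmatchcase(path, pattern) for pattern in PASS_THROUGH_PATTERNS)
```

      Anything that doesn't match, such as a post permalink, would be served from the cache.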

      A great technique for feedback forms is to use WuFoo, where you can visually construct forms and integrate them into your website with simple JavaScript. This means that your page can remain static – the JavaScript code dynamically inserts the form, and WuFoo handles the processing, stops the spam, and sends you the results via email.

      When Content Isn’t So Dynamic

       

      Sometimes content changes infrequently – for example, your favicon probably changes rarely. Blog posts, once written, seldom change. Serving these items from a CDN is still an effective way to reduce load on your webserver and reduce latency for your users. But when things do change – such as updated images, additional comments, or new posts – how can you use CloudFront to serve the new content? How can you make sure CloudFront works well with your updated content?

      Object versioning

      A common technique used to enable updating static objects is called object versioning. This means adding a version number to the file name, and updating the link to the file when a new version is available. This technique also allows an entire set of resources to be versioned at once, when you create a versioned directory name to hold the resources.

      Object versioning works well with CloudFront. In fact, it is the recommended way to update static resources that change infrequently. The alternative method, invalidating objects, is more expensive and difficult to control.
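
      A minimal Python sketch of both flavors of object versioning described above, assuming a ".vN" file-name suffix and a "vN" directory under a /static root. Both naming schemes and both function names are illustrative choices, not anything CloudFront prescribes:

```python
import posixpath


def versioned_name(path, version):
    """Insert a version number before the file extension."""
    stem, extension = posixpath.splitext(path)
    return f"{stem}.v{version}{extension}"


def versioned_directory(path, version, root="/static"):
    """Version a whole set of resources at once via a versioned directory."""
    relative = posixpath.relpath(path, root)
    return posixpath.join(root, f"v{version}", relative)
```

      With this scheme, /css/style.css at version 3 becomes /css/style.v3.css, and pages that link to it are updated to the new name, so the CDN sees a new object instead of a changed one.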

      Combining the Above Techniques

       

      You can use a combination of the above techniques to create a low-latency service that caches sometimes-dynamic content. For example, a WordPress blog could be optimized by integrating these techniques into the WordPress engine, perhaps via a plugin. Here’s what you’d do:

      • Create a CloudFront distribution for the site, setting its custom origin to point to the webserver.
      • Poke the holes in the distribution that are necessary for the admin, login, and forms pages.
      • Create new versions of pages, images, etc. when they change, and new versions of the pages that refer to them.

      Even though WordPress generates each page via PHP, this collection of techniques allows the pages to be served via CloudFront and also be updated when changes occur. I don’t know of a plugin that combines all these techniques, but I suspect the good folks at W3-EDGE, producers of the W3 Total Cache performance optimization framework I mentioned above, are already working on it.

      by shlomo at May 14, 2012 06:21 PM

      ReadWriteCloud


      Google Prices its Cloud SQL Offering, Solidifies Cloud Database Market

      The cloud database market continues to solidify as Google puts a price tag on its Cloud SQL offering. With actual charges to begin on June 12th, the move finally gives developers a way to see what they'll be spending on Cloud SQL, but comparing Google's offering to Amazon, Microsoft and others might still be a bit tricky.

      Google's Cloud SQL is MySQL-based and is intended to be used with Google App Engine (GAE). Google's pricing structure is very simple, though not as comprehensive or as expandable as those of Amazon or others.

      Google has two billing plans: a package plan and a per-use plan. The package plan has four tiers, each of which includes a set amount of RAM, storage and I/O per day. For instance, Google charges $1.46 per day for the D1 tier, which has .5GB of RAM, 1GB of storage and 850,000 I/O requests. The top package (D8) includes 4GB of RAM, 10GB storage and 8 million I/O requests for $11.71 per day.

      The same instances are available on an on-demand basis, starting at $0.10 per hour, with storage and I/O extra.

      The cheapest package from Google, then, runs about $45 a month and the most expensive runs about $357. That doesn't count any overages for I/O or storage.
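
      The monthly figures follow from the daily package prices. A quick check in Python, assuming a 30.5-day month (the article doesn't state the month length it used, so that constant is a guess that happens to land near the quoted numbers):

```python
# Daily package prices quoted in the article, in US dollars.
DAILY_PRICE = {"D1": 1.46, "D8": 11.71}

# Assumption: an average month of 30.5 days.
DAYS_PER_MONTH = 30.5


def monthly_cost(tier):
    """Approximate monthly cost of a package tier, excluding overages."""
    return DAILY_PRICE[tier] * DAYS_PER_MONTH
```

      Under that assumption D1 comes to a little under $45 a month and D8 to about $357, in line with the figures above.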

      Sizing Up Google's Pricing

      Trying to compare Google pricing with Amazon, Azure or databases offered with PaaS services such as Heroku and Engine Yard is tricky, at best. Heroku's database offerings start at $50 per month, but the specs for its database differ considerably from the other providers. For example, Heroku features data clips for developers, and the hstore extension for key/value data storage.

      Amazon's DB instances seem to be a bit more powerful than Google Cloud SQL instances, and Amazon has features that Google Cloud SQL doesn't. For instance, Amazon's Small DB instance has 1.7 GB of RAM and has the equivalent of a single CPU. With Cloud SQL, you're also limited to the languages that Google App Engine supports, Python and Java.

      With Amazon RDS, developers can choose between 5GB and 1TB of storage (the max for Google is 10GB storage). The Small DB instance runs about $77 a month, if it's on-demand. But, choosing a one-year reserved instance brings that down to about $45 a month. The pricing, then, seems to line up for the "small" instances for Amazon RDS and Google Cloud SQL, but Google has fewer features and what looks to be less compute power.

      But if you're using GAE, then Cloud SQL is the natural choice - so it's nice to see Google finally getting this into developers' hands. If you're using GAE and Cloud SQL, we'd love to hear what you think.

      by Joe Brockmeier at May 14, 2012 05:58 PM

      Amazon Web Services


      AWS Week in Review - May 7, 2012

      Let's take a quick look at what happened in AWS-land last week:

      Monday,  May 7
      Tuesday, May 8
      Wednesday, May 9
      Thursday, May 10
      Friday, May 11
      Sunday, May 13

      Stay tuned for another exciting week!

      -- Jeff;

      by AWS Evangelist at May 14, 2012 03:50 PM

      Amazon CloudFront - Support for Dynamic Content

      Introduction
      Amazon CloudFront's network of edge locations (currently 30, with more in the works) gives you the ability to distribute static and streaming content to your users at high speed with low latency.

      Today we are introducing a set of features that, taken together, allow you to use CloudFront to serve dynamic, personalized content more quickly.

      What is Dynamic Personalized Content?
      As you know, content on the web is identified by a URL, or Uniform Resource Locator such as http://media.amazonwebservices.com/blog/console_cw_est_charge_service_2.png . A URL like this always identifies a unique piece of content.

      A URL can also contain a query string. This takes the form of a question mark ("?") and additional information that the server can use to personalize the request. Suppose that we had a server at www.example.com that can return information about a particular user by invoking a PHP script that accepts a user name as an argument, with URLs like http://www.example.com/userinfo.php?jeff or http://www.example.com/userinfo.php?tina.

      Up until now, CloudFront did not use the query string as part of the key that it uses to identify the data that it stores in its edge locations.

      We're changing that today, and you can now use CloudFront to speed access to your dynamic data at our current low rates, making your applications faster and more responsive, regardless of where your users are located.

      With this change (and the others that I'll tell you about in a minute), Amazon CloudFront will become an even better component of your global applications. We've put together a long list of optimizations that will each increase the performance of your application on their own, but will work even better when you use them in conjunction with other AWS services such as Route 53, Amazon S3, and Amazon EC2.

      Tell Me More
      Ok, so here's what we've done:

      Persistent TCP Connections - Establishing a TCP connection takes some time because each new connection requires a three-way handshake between the server and the client. Amazon CloudFront makes use of persistent connections to each origin for dynamic content. This obviates the connection setup time that would otherwise slow down each request. Reusing these "long-haul" connections back to the server can eliminate hundreds of milliseconds of connection setup time. The connection from the client to the CloudFront edge location is also kept open whenever possible.

      Support for Multiple Origins - You can now reference multiple origins (sources of content) from a single CloudFront distribution. This means that you could, for example, serve images from Amazon S3, dynamic content from EC2, and other content from third-party sites, all from a single domain name. Being able to serve your entire site from a single domain will simplify implementation, allow the use of more relative URLs within the application, and can even get you past some cross-site scripting limitations.

      Support for Query Strings - CloudFront now uses the query string as part of its cache key. This optional feature gives you the ability to cache content at the edge that is specific to a particular user, city (e.g. weather or traffic), and so forth. You can enable query string support for your entire website or for selected portions, as needed.
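
      To make the cache-key idea concrete, here is a toy Python model (not CloudFront's actual implementation; the function name and key format are assumptions) showing how the two example URLs collapse to one key when query strings are ignored and stay distinct when they are included:

```python
from urllib.parse import urlsplit


def cache_key(url, include_query_string):
    """Build a cache key for a URL, optionally including its query string."""
    parts = urlsplit(url)
    # Host names are case-insensitive, so normalize them; paths are not.
    key = parts.netloc.lower() + parts.path
    if include_query_string and parts.query:
        key += "?" + parts.query
    return key
```

      With query strings off, userinfo.php?jeff and userinfo.php?tina would share a single cached object; with them on, each user's page gets its own entry.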

      Variable Time-To-Live (TTL) - In many cases, dynamic content is either not cacheable or cacheable for a very short period of time, perhaps just a few seconds. In the past, CloudFront's minimum TTL was 60 minutes since all content was considered static. The new minimum TTL value is 0 seconds. If you set the TTL for a particular origin to 0, CloudFront will still cache the content from that origin. It will then make a GET request with an If-Modified-Since header, thereby giving the origin a chance to signal that CloudFront can continue to use the cached content if it hasn't changed at the origin.
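
      The TTL-of-0 behavior amounts to revalidating on every request. A simplified Python simulation of that flow, where the Origin class, the integer timestamps, and the function name are all stand-ins for illustration (real requests carry HTTP dates in the If-Modified-Since header):

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    body: str
    last_modified: int  # simplified timestamp; real headers carry HTTP dates


class Origin:
    """Toy origin server that answers conditional GET requests."""

    def __init__(self, body, last_modified):
        self.body = body
        self.last_modified = last_modified

    def get(self, if_modified_since=None):
        # 304 tells the cache its copy is still good; no body is sent.
        if if_modified_since is not None and self.last_modified <= if_modified_since:
            return 304, None, self.last_modified
        return 200, self.body, self.last_modified


def fetch_with_zero_ttl(cache, key, origin):
    """Serve key with TTL 0: always revalidate, reuse the body on a 304."""
    cached = cache.get(key)
    since = cached.last_modified if cached else None
    status, body, last_modified = origin.get(if_modified_since=since)
    if status == 304:
        return cached.body, "revalidated"
    cache[key] = CachedObject(body, last_modified)
    return body, "fetched"
```

      The origin is contacted every time, but an unchanged object costs only a short 304 response instead of a full transfer.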

      Large TCP Window - We increased the initial size of CloudFront's TCP window to 10 back in February, but we didn't say anything at the time. This enhancement allows more data to be "in flight" across the wire at a given time, without the usual waiting time as the window grows from the older value of 2.
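
      A back-of-the-envelope Python model of why the larger initial window helps, assuming the values 10 and 2 are counted in TCP segments and that the window simply doubles every round trip (idealized slow start with no loss; both are assumptions, not details given in the post):

```python
def round_trips_to_send(total_segments, initial_window):
    """Count round trips under idealized slow start (window doubles each trip).

    Ignores packet loss, receiver window limits, and delayed ACKs.
    """
    sent = 0
    window = initial_window
    trips = 0
    while sent < total_segments:
        sent += window
        window *= 2
        trips += 1
    return trips
```

      In this model a 30-segment response needs four round trips starting from a window of 2 but only two starting from a window of 10, which matters most on long, high-latency paths.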

      API and Management Console Support - All of the features listed above are accessible from the CloudFront APIs and the CloudFront tab of the AWS Management Console. You can now use URL patterns to exercise fine-grained control over the caching and delivery rules for different parts of your site.

      Of course, all of CloudFront's existing static content delivery features will continue to work as expected: GET and HEAD requests, default root object, invalidation, private content, access logs, IAM integration, and delivery of objects compressed by the origin.

      Working Together
      Let's take a look at the ways that various AWS services work together to make delivery of static and dynamic content as fast, reliable, and efficient as possible (click on the diagram at right for an even better illustration):

      • From Application / Client to CloudFront - CloudFront’s request routing technology ensures that each client is connected to the nearest edge location as determined by latency measurements that CloudFront continuously takes from internet users around the world. Route 53 may be optionally used as a DNS service to create a CNAME from your custom domain name to your CloudFront distribution. Persistent connections expedite data transfer.
      • Within the CloudFront Edge Locations - Multiple levels of caching at each edge location speed access to the most frequently viewed content and reduce the need to go to your origin servers for cacheable content.
      • From Edge Location to Origin - The nature of dynamic content requires repeated back and forth calls to the origin server. CloudFront edge locations collapse multiple concurrent requests for the same object into a single request. They also maintain persistent connections to the origins (with the large window size). Connections to other parts of AWS are made over high-quality networks that are monitored by Amazon for both availability and performance. This monitoring has the beneficial side effect of keeping error rates low and window sizes high.
       

      Cache Behaviors
      In order to give you full control over query string support, TTL values, and origins you can now associate a set of Cache Behaviors with each of your CloudFront distributions. Each behavior includes the following elements:

      • Path Pattern - A pattern (e.g. "*.jpg") that identifies the content subject to this behavior.
      • Origin Identifier - The identifier for the origin where CloudFront should forward user requests that match this path pattern.
      • Query String - A flag to enable support for query string processing for URLs that match the path pattern.
      • Trusted Signers - Information to enable other AWS accounts to create signed URLs for this URL path pattern.
      • Protocol Policy - Either allow-all or https-only, also applied only to this path pattern.
      • MinTTL - The minimum time-to-live for content subject to this behavior.
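
      The elements above can be pictured as a small record per path pattern. The following Python sketch is a hypothetical model, not the CloudFront API: it omits trusted signers and protocol policy for brevity, assumes behaviors are checked in order with a fallback default, and uses fnmatch wildcards as an approximation of CloudFront's pattern matching. The origin names are made up.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class CacheBehavior:
    path_pattern: str
    origin_id: str
    forward_query_string: bool
    min_ttl_seconds: int


def select_behavior(path, behaviors, default):
    """Return the first behavior whose pattern matches, else the default."""
    for behavior in behaviors:
        if fnmatchcase(path, behavior.path_pattern):
            return behavior
    return default


# Hypothetical distribution: images from S3, a per-user page from EC2.
BEHAVIORS = [
    CacheBehavior("*.jpg", "s3-images", False, 3600),
    CacheBehavior("/userinfo.php", "ec2-app", True, 0),
]
DEFAULT_BEHAVIOR = CacheBehavior("*", "ec2-app", False, 60)
```

      A request for /img/logo.jpg would pick the S3 origin with a one-hour minimum TTL, while /userinfo.php would go to the EC2 origin with query strings forwarded and a minimum TTL of 0.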

      Tool Support
      Andy from CloudBerry Lab sent me a note to let me know that they have added dynamic content support to the newest free version of the CloudBerry Explorer for Amazon S3.  In Andy's words:

      I'd like to let you know that CloudBerry Explorer is ready to support new CloudFront features by the time of release.  We have added the ability to manage multiple origins for a distribution, configure cache behavior for each origin based on URL path patterns and configure CloudFront to include query string parameters.

      You can read more about this in their new blog post, How to configure CloudFront Dynamic Content with CloudBerry S3 Explorer.

      Andy also sent some screen shots to show us how it works. The first step is to specify the Origins and CNAMEs associated with the distribution:

      The next step is to specify the Path Patterns:

      With the Origins and Path Patterns established, the final step is to configure the Path Patterns:

      And Here You Go
      Together with CloudFront's cost-effectiveness (no minimum commits or long-term contracts), these features add up to a content distribution system that is fast, powerful, and easy to use.

      So, what do you think? What kinds of applications can you build with these powerful new features?

      -- Jeff;

      PS - Read more about this new feature in Werner's new post: Dynamic Content Support in Amazon CloudFront.

      by AWS Evangelist at May 14, 2012 07:38 AM

      All Things Distributed (Werner Vogels)


      Dynamic Content Support in Amazon CloudFront

      In the past three and a half years, Amazon CloudFront has changed the content delivery landscape. It has demonstrated that a CDN does not have to be complex to use with expensive contracts, minimum commits, or upfront fees, such that you are forcibly locked into a single vendor for a long time. CloudFront is simple, fast and reliable with the usual pay-as-you-go model. With just one click you can enable content to be distributed to the customer with low latency and high-reliability.

      Today Amazon CloudFront has taken another major step forward in ease of use. It now supports delivery of entire websites containing both static objects and dynamic content. With these features CloudFront makes it as simple as possible for customers to use CloudFront to speed up delivery of their entire dynamic website running in Amazon EC2/ELB (or third-party origins), without needing to worry about which URLs should point to CloudFront and which ones should go directly to the origin.

      Dynamic Content Support

      Recall that last month the CloudFront team announced lowering the minTTL customers can set on their objects, down to as low as 0 seconds to support delivery of dynamic content. In addition to the TTLs, customers also need some other features to deliver dynamic websites through CloudFront. The first set of features that CloudFront is launching today include:

      Multiple Origin Servers: the ability to specify multiple origin servers, including a default origin, for a CloudFront download distribution. This is useful when customers want to use different origin servers for different types of content. For example, an Amazon S3 bucket can be used as the origin for static objects and an Amazon EC2 instance as the origin for dynamic content, all fronted by the same CloudFront distribution domain name. Of course non-AWS origins are also permitted.

      Query String based Caching: the ability to include query string parameters as part of the object's cache key. Customers will have a switch to turn query strings 'on' or 'off'. When turned off, CloudFront's behavior will be the same as today - i.e., CloudFront will not pass the query string to the origin server nor include query string parameters as a part of the object's cache key. And when query strings are turned on, CloudFront will pass the full URL (including the query string) to the origin server and also use the full URL to uniquely identify an object in the cache.

      URL based configuration: the ability to configure cache behaviors based on URL path patterns. Each URL path pattern will include a set of cache behaviors associated with it. These cache behaviors include the target origin, a switch for query strings to be on/off, a list of trusted signers for private content, the viewer protocol policy, and the minTTL that CloudFront should apply for that URL path pattern. See the graphic at the end of this post for an example configuration.

      More new features

      In addition to these features, the CloudFront team has made other changes to speed up delivery of content, and all customers will get these benefits by default without additional configuration. These performance optimizations are available for all types of content (static and dynamic) delivered via CloudFront. Specifically:

      Optimal TCP Windows. The TCP initcwnd has been increased for all CloudFront hosts to maximize the available bandwidth between the edge and the viewer. This is in addition to the existing optimizations of routing viewers to the edge location with lowest latency for that user, and also persistent connections with the clients.

      Persistent Connection to Origins. Connections are improved from CloudFront edge locations to the origins by maintaining long-lived persistent connections. This helps by reducing the connection set-up time from the edge to the origin for each new viewer. When the viewer is far away from the origin, this is even more helpful in minimizing total latency between the viewer and the origin.

      Selecting the best AWS region for Origin Fetch. When customers run their origins in AWS, we expect that our network paths from each CloudFront edge to the various AWS Regions will perform better with less packet loss given that we monitor and optimize these network paths for availability and performance. In addition, we have shown an optional configuration in the architecture diagram how developers can use Route 53’s LBR (Latency Based Routing) to run their origin servers in different AWS Regions. Each CloudFront edge location will then go to the “best” AWS Region for the origin fetch. And Route 53 already understands very well which CloudFront host is in which edge location (this is integration we’ve built between the two services). This helps improve performance even further.

      Amazon CloudFront is expanding its functionality and feature set at an incredible pace. I am particularly excited about these features that help customers deliver both static and dynamic content through one distribution. CloudFront stays true to its mission of making a Content Delivery Network dead simple to use, and now it also does this for dynamic content.

      For more details, see the CloudFront detail page and the posting on the AWS developer blog.

      May 14, 2012 05:01 AM

      May 13, 2012

      OakLeaf Systems

      Windows Azure and Cloud Computing Posts for 5/10/2012+

      A compendium of Windows Azure, Service Bus, EAI & EDI, Access Control, Connect, SQL Azure Database, and other cloud-computing articles.

      Note: This post is updated daily or more frequently, depending on the availability of new articles in the following sections:


      Azure Blob, Drive, Table, Queue and Hadoop Services

      Gaurav Mantri (@gmantri) continued his series with Comparing Windows Azure Blob Storage and Amazon Simple Storage Service (S3)–Part II on 5/11/2012:

      In part I of this blog post, we started comparing Windows Azure Blob Storage and Amazon Simple Storage Service (S3). We covered basic concepts and compared pricing and features of Blob Containers and Buckets. You can read that blog post here: http://gauravmantri.com/2012/05/09/comparing-windows-azure-blob-storage-and-amazon-simple-storage-service-s3part-i/

      In this post, we’re going to compare Blobs and Objects from a features point of view.

      As in the previous post, we’re going to refer to Windows Azure Blob Storage as WABS and Amazon Simple Storage Service as AS3 in the rest of this blog post for the sake of brevity.

      Concepts

      Before we talk about these two services in greater detail, I think it is important to get some concepts clear about blobs and objects. This section is taken verbatim from the previous post.

      Blobs and Objects: Simply put blobs (in WABS) and objects (in AS3) are the files in your cloud file system. They go into blob containers and buckets respectively.

      A few comments about blobs and objects:

      • There is no limit on the number of blobs and objects you can store. While AS3 does not tell you the maximum storage capacity allocated for you, the total number of blobs in WABS is restricted by the size of your storage account (100 TB currently).
      • The maximum size of an object you can store in AS3 is 5 TB, whereas the maximum size of a blob in WABS is 1 TB.
      • In WABS, there are two kinds of blobs – Block Blobs and Page Blobs. Block Blobs are suitable for streaming payloads (e.g., images, videos, documents) and can be a maximum of 200 GB in size. Page Blobs are suitable for random read/write payloads and can be a maximum of 1 TB in size. A common use case of a page blob is a VHD mounted as a drive in a Windows Azure role. In AS3, there is no such distinction. To learn more about block blobs and page blobs, click here.
      • Both systems are quite feature rich as far as operations on blobs and objects are concerned. You can copy, upload, download and perform other operations on them.
      • While both systems allow you to protect your content from unauthorized access, the ACL mechanism is much more granular in AS3, where you can set a custom ACL on each object in a bucket. In WABS, it is set at the blob container level. …

      Gaurav continues with a detailed description of the differences between WABS and AS3. Read more.

      Full disclosure: I have received and use gratis copies of Cerebrata’s Windows Azure tools. Free licenses for these tools are offered by Red Gate Software to all Windows Azure MVPs and Insiders.


      Alejandro Jezierski (@alexjota) described Hadoop on Azure and importing data from the Windows Azure Marketplace in a 5/10/2012 post:

      In a previous post I briefly introduced Hadoop and gave an overview of its components. In this post I’ll put my money where my mouth is and actually do stuff with Hadoop on Azure. Again, one step at a time.

      So I went to the market and found some free demographics data available from the UN to mess around with and imported it to my Hadoop on Azure cluster via the portal. Before you blindly go about importing data, you can first sample the goods. In the Windows Azure Marketplace you can subscribe to published data, and build a query of your interest. The most important thing here is that you need to take note of your primary account key and the query url.

      Take note of the query and passkey. Click on Show.

      Sample the goods, build your query.

      Once we have the query we want, we need to go to our Hadoop on Azure portal, select Manage Cluster and then select DataMarket. Here we will have to input our user name (from your email), the passkey obtained earlier, the query url you obtained as well and the name of the Hive table so we can access the data after the import is done. Note: I’ve replaced the encoded space and quotation marks to avoid Bad Request errors. This happened to me because I copied and pasted the query right out from the marketplace. It took a couple of tries until I figured it out, oh well. Run the query by selecting Import Data.
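
      The replacement of encoded spaces and quotation marks mentioned in the note can be scripted. The Python sketch below is only an illustration: the query URL is made up, and the assumption is that the encoded characters were %20 (space), %27 (single quote) and %22 (double quote), which the post does not spell out.

```python
def decode_space_and_quotes(url: str) -> str:
    """Replace percent-encoded spaces and quotation marks with literal characters."""
    for encoded, plain in (("%20", " "), ("%27", "'"), ("%22", '"')):
        url = url.replace(encoded, plain)
    return url


# Hypothetical query URL in the shape copied from a marketplace query builder.
copied = "https://example.com/Demographics/Values?$filter=Name%20eq%20%27Population%27&$top=100"
cleaned = decode_space_and_quotes(copied)
```

      Only the three listed sequences are touched, so any other percent-encoded characters in the URL are left as they were.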

      Now we can go to the Hive interactive console and take a look at the results. We can type the show tables command for a list of the tables, and make sure ours is there.

      Take into consideration that, although this looks like a table, it has columns like a table, and we can query the data as if it were stored in the table, it’s not. When we create tables in HiveQL, partition them and load data, HDFS files are actually created and stored (and replicated and distributed through the nodes).

      Now we can go on and type our HiveQL query. Remember what’s happening under the hood. This query is creating and executing a MapReduce job. I’m also a newbie in the MapReduce world, so I trust Gert Drapers when he says a simple join in HiveQL is equivalent to writing a much more complex bunch of code, and that Facebook’s jobs are mostly written in HiveQL. That means something, doesn’t it?

      So we’ve executed our simple HiveQL query and seen the results. We can always go back to the job history for that query, too (you can’t miss that big orange button in the main page).

      Job History.

      So, we’ve imported data from the marketplace and run a simple HiveQL query. In a future post we can go through the samples that are included when you set up your cluster and mess around with the JavaScript console. Refer to the previous post for additional links and resources.


      Alejandro Jezierski (@alexjota) posted Big Data, Hadoop on Azure and the elephant in the room on 5/8/2012:

      Seriously. There’s an elephant in the room, so I’ve no choice but to talk about it. I’m new to Big Data and newer to Hadoop on Azure, so this post (and future ones as well) will serve as an introduction to the underlying concepts of big data and my experience on using Hadoop on Azure, one step at a time.

      Big Data

      So we generate massive amounts of data. Massive. Structured, not structured, from devices, from sensors, feeds, tweets, blogs, everything we do in our daily lives generates data at some point. What do we do with it besides store it? Ignore it? Throw it away? We could. But data is there for a reason. We can extract valuable information from it. We can discover new business insights, interesting patterns emerge, and most important we could save lives… so yes, it’s a big deal. Ok, so we’ll leave the processing to multi-million dollar companies, they can afford it, right? One misconception is that we need all this massive state of the art infrastructure to be able to handle big data. We can set up nodes on affordable, commodity hardware, and achieve the same results. Nice. But I still need to maintain all these boxes, and they WILL fail eventually…

      Hadoop is a scalable, highly fault-tolerant open source MapReduce solution. It runs on commodity hardware, so there is an economic advantage to it.

      The main components of Hadoop are illustrated in the following diagram.

      HDFS: Hadoop Distributed File System. It’s the storage mechanism used by Hadoop applications. Amongst other things, it stores replicas of data blocks on the nodes of your cluster, which aids availability, reliability, performance, etc.

      Map Reduce. A programming framework that lets you create mappers and reducers. The framework will construct a set of jobs, hand them over to the nodes for processing, and keep track of them. The map operation lets you take the processing to where the data is stored in the distributed file system. The reduce operation summarizes the results from the mappers.
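
      To make the mapper and reducer roles concrete, here is a minimal single-process Python sketch. It is not Hadoop code: a real job distributes the map and reduce phases across the cluster, and the record layout and sample figures below are made up for illustration. The job computes the largest population figure per country, roughly what a HiveQL query with MAX and GROUP BY would express.

```python
from collections import defaultdict

# Made-up sample records: (country, year, population in millions).
RECORDS = [
    ("Argentina", 2000, 37),
    ("Argentina", 2010, 41),
    ("Uruguay", 2000, 3),
    ("Uruguay", 2010, 3),
]


def mapper(record):
    """Emit one (key, value) pair per input record."""
    country, _year, population = record
    yield country, population


def reducer(key, values):
    """Summarize all values that share a key."""
    return key, max(values)


def run_job(records):
    # Shuffle phase: group the mapper output by key before reducing.
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


result = run_job(RECORDS)
```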

      Hive: it provides a few things, such as the possibility to create a structure for data through the use of tables. It also defines a SQL oriented language (QL, or Hive QL), in other words, MapReduce for mortals. The magic behind this is that a Hive query can be translated to a MapReduce job, and present the results back to the user. The need for Hive appeared because creating a relatively simple query in plain MapReduce jobs resulted in a cumbersome coding experience.

      Sqoop: Bridge between the Hadoop and the relational world. Because we also live in a relational world, right? We can import data from SQL Server, let Sqoop store the data in HDFS, and make the data available for our MapReduce jobs as well.

      Hadoop on Azure is Microsoft’s Hadoop distribution that runs on Windows, plus a hosting environment on Windows Azure. I recently got invited to use the CTP version of Hadoop on Azure (I did ask for an invitation a few weeks ago) and started to get familiar with its features. The huge benefit to this is that I don’t need to maintain all those nodes, I have my own cluster now, and I’m ready to handle massive amounts of data. Tada!

      In future posts I’ll be showing how to execute a simple MapReduce job, or how to get data from the Windows Azure Marketplace and query the data using Hive. The following links lead you to useful resources if you are getting started with Hadoop on Azure.

      Big Data, Big deal, video by Gert Drapers

      Introduction to Hadoop on Azure, video by Wenming Ye

      Hadoop on Azure portal, get invited!

      Apache Hadoop

      <Return to section navigation list>

      SQL Azure Database, Federations and Reporting

      Nathan Totten (@ntotten) and Nick Harris (@cloudnick) produced CloudCover Episode 80 - Getting Started with SQL Azure Data Sync Preview on 5/11/2012:

      Join Nate and Nick each week as they cover Windows Azure. You can follow and interact with the show at @CloudCoverShow.

      In this episode, we are joined by Cory Fowler and Scott Klein, Windows Azure Technical Evangelists, who demonstrate how to get started with the SQL Azure Data Sync Preview.

      In the News:

      In the Tip of the Week, we discuss the Cloud Ninja Metering Block - an extensible and reusable software component designed to assist software developers with the metering of tenant resource usage in a multi-tenant solution on the Windows Azure platform.

      Learn more about the SQL Azure Data Sync Preview FAQ here.


      <Return to section navigation list>

      MarketPlace DataMarket, Social Analytics, Big Data and OData

      My (@rogerjenn) Creating An Incremental SQL Azure Data Source for OakLeaf’s U.S. Air Carrier Flight Delays Dataset post of 5/8/2012 (updated 5/11/2012) begins:

      • Updated 5/11/2012 with a correction regarding free databases for Windows Azure Marketplace DataMarket datasets.

      • Background
      • Creating the SQL Azure On_Time_Performance Table
      • Creating an On-Premises SQL Server Clone Table
      • Importing *.csv Data with the BULK INSERT Command
      • Uploading Data to the SQL Azure Table with SQLAzureMW
      • Calculating the Size of the SQL Azure Database and Checking for Upload Errors
      • Conclusion
      Background

      My initial U.S. Air Carrier Flight Delays, Monthly dataset for the Windows Azure Marketplace DataMarket, which has been disabled, was intended to incorporate individual tables for each month of the years 1987 through 2012 (and later.) I planned to compare performance of datasets and Windows Azure blob storage as persistent data sources for Apache Hive tables created with the new Apache Hadoop on Windows Azure feature.

      I used Microsoft Codename “Data Transfer” to create the first two of these SQL Azure tables, On_Time_Performance_2012_1 and On_Time_Performance_2012_2, from corresponding Excel On_Time_Performance_2012_1.csv and On_Time_Performance_2012_2.csv files in early May 2012. For more information about these files and the original U.S. Air Carrier Flight Delays, Monthly dataset see my Two Months of U.S. Air Carrier Flight Delay Data Available on the Windows Azure Marketplace DataMarket post of 5/4/2012.

      Subsequently, I discovered that the Windows Azure Marketplace Publishing Portal had problems uploading the large (~500,000 rows, ~15 MB) On_Time_Performance_YYYY_MM.csv files. I was advised by Microsoft’s Group Program Manager for the DataMarket that the *.csv upload feature would be disabled to “prevent confusion.” For more information about this issue, see my Microsoft Codename “Data Transfer” and “Data Hub” Previews Don’t Appear Ready for BigData post updated 5/5/2012.

      A further complication was the suspicion that editing the current data source to include each additional table would require a review by a DataMarket proctor. An early edit of one character in a description field had caused my dataset to be offline for a couple of days.

      A workaround for the preceding two problems is to create an on-premises clone of the SQL Azure table with a RowID identity column and recreate the SQL Azure table without the identity property on the RowID column. Doing this permits using a BULK INSERT instruction to import new rows from On_Time_Performance_YYYY_MM.csv files to the local SQL Server 2012 table and then using George Huey’s SQL Azure Migration Wizard (SQLAzureMW) v3.8.7 or later to append new data to a single On_Time_Performance SQL Azure table. Managing primary key identity values of an on-premises SQL Server table is safer and easier than with SQL Azure.

      The downside of this solution is that maintaining access to the 1-GB SQL Azure Web database will require paying at least US$9.99 per month plus outbound bandwidth charges after your free trial expires. Microsoft provides up to four free SQL Azure 1-GB databases when you specify a new database on the Codename “Data Hub” Publishing Portal’s Connect page.

      This post describes the process and T-SQL instructions for creating and managing the on-premises SQL Server [Express] 2012 databases, as well as incrementally uploading new data to the SQL Azure database.

      My new US Air Carrier Flight Delays dataset on the Windows Azure Marketplace DataMarket and Microsoft Codename “Data Hub” have been updated to the new data source. The “Monthly” suffix has been removed.


      Vitek Karas described OData V3 demo services in a 5/11/2012 post to the Odata.org blog:

      In April we shipped OData version 3 along with the 5.0 release of WCF Data Services. To make it easier for you to discover the new features and for us to show you some of them, we’re now making available the OData demo services which use the OData V3 protocol.

      The new services are hosted side by side with old demo services which we didn’t change. The new demo services are using the WCF Data Services 5.0.1-rc release currently, and we will update them to the newer version once it becomes available.

      There are 3 demo services:

      The read-only Demo service, which has an updated model to use some V3 features as described below, is hosted here:

      http://services.odata.org/V3/OData/OData.svc/

      The read-write Demo service, which is the exact same model as the read-only Demo service including new V3 features, is hosted here:

      http://services.odata.org/V3/(S(readwrite))/OData/OData.svc/

      The read-only Northwind service, which is the exact same model as the existing Northwind service, is hosted here:

      http://services.odata.org/V3/Northwind/Northwind.svc/

      Here are the V3 features we’ve enabled so far.

      Actions

      The Product type has a Discount action on it which takes a single parameter discountPercentage of type Edm.Int32. The action takes the Price of the product it’s applied to and decreases it by the percentage specified by the parameter. For example:

      GET http://services.odata.org/V3/(S(plcxuejnllfvrrecpvqbehxz))/OData/OData.svc/Products(1)

      Will return a product with Price: 3.5

      POST http://services.odata.org/V3/(S(plcxuejnllfvrrecpvqbehxz))/OData/OData.svc/Products(1)/Discount HTTP/1.1
      Content-Type: application/json;odata=verbose
      { "discountPercentage": 25 }

      The response should be 204 No Content.

      And now again

      GET http://services.odata.org/V3/(S(plcxuejnllfvrrecpvqbehxz))/OData/OData.svc/Products(1)

      Returns a product with Price: 2.625
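
      The arithmetic behind the two prices can be checked with a short Python sketch. The service-side implementation of the Discount action is not shown in the post, so the function below is an assumption that simply reproduces the stated behavior (reduce Price by the given percentage).

```python
import json
from decimal import Decimal


def apply_discount(price: Decimal, discount_percentage: int) -> Decimal:
    """Reduce price by discount_percentage percent (assumed action semantics)."""
    return price * (Decimal(100) - discount_percentage) / Decimal(100)


# Request body for the POST above, serialized from a plain dictionary.
body = json.dumps({"discountPercentage": 25})

# 3.5 reduced by 25 percent is 3.5 * 0.75 = 2.625.
new_price = apply_discount(Decimal("3.5"), 25)
```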

      Spatial

      The Supplier type has a property Location which is of type Edm.GeographyPoint. You can see it here:

      http://services.odata.org/V3/OData/OData.svc/Suppliers(0)

      Any and All

      The service now supports any and all operators. For example:

      http://services.odata.org/V3/OData/OData.svc/Categories?$filter=Products/any(p: p/Rating ge 4)
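
      For readers new to these operators, the Python sketch below mimics their meaning on in-memory data and shows one way to percent-encode the filter expression for a URL. The sample categories and ratings are made up (only Milk's rating of 3 appears in the post), and the choice of which characters to leave unencoded is an assumption.

```python
from urllib.parse import quote

# Made-up sample data shaped like the demo service's Categories and Products.
categories = [
    {"Name": "Food", "Products": [{"Name": "Bread", "Rating": 4},
                                  {"Name": "Milk", "Rating": 3}]},
    {"Name": "Beverages", "Products": [{"Name": "Cola", "Rating": 3}]},
    {"Name": "Electronics", "Products": [{"Name": "DVD Player", "Rating": 5},
                                         {"Name": "LCD HDTV", "Rating": 4}]},
]

# Products/any(p: p/Rating ge 4): at least one product qualifies.
any_match = [c["Name"] for c in categories
             if any(p["Rating"] >= 4 for p in c["Products"])]

# Products/all(p: p/Rating ge 4): every product qualifies.
all_match = [c["Name"] for c in categories
             if all(p["Rating"] >= 4 for p in c["Products"])]

# Percent-encode the spaces in the filter before placing it in a URL.
encoded_filter = quote("Products/any(p: p/Rating ge 4)", safe="/():")
```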

      Inheritance support

      You can now address properties on derived types. The demo service doesn’t have a sample property like that yet, but you can try the new URL syntax with type cast anyway:

      http://services.odata.org/V3/OData/OData.svc/Products/ODataDemo.Product

      Patch support

      You can send PATCH requests instead of MERGE. The behavior is identical otherwise.
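
      Both verbs carry merge semantics: only the properties present in the payload change. The Python sketch below illustrates that on a dictionary and contrasts it with a full replacement as done by PUT. The default values used for the replacement are hypothetical.

```python
# Hypothetical per-property defaults applied when an entity is fully replaced.
DEFAULTS = {"Name": None, "Rating": 0, "Price": None}


def merge_update(entity: dict, changes: dict) -> dict:
    """PATCH/MERGE: keep existing properties, overwrite only those supplied."""
    return {**entity, **changes}


def replace_update(changes: dict) -> dict:
    """PUT: properties missing from the payload fall back to defaults."""
    return {**DEFAULTS, **changes}


milk = {"Name": "Milk", "Rating": 3, "Price": "2.625"}
patched = merge_update(milk, {"Price": "3.5"})
replaced = replace_update({"Price": "3.5"})
```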

      Prefer header support

      You can specify a Prefer header in create or update requests and ask the server to either omit or include the payload. For example:

      PATCH http://services.odata.org/V3/(S(plcxuejnllfvrrecpvqbehxz))/OData/OData.svc/Products(1) HTTP/1.1
      Accept: application/json;odata=verbose
      Content-Type: application/json;odata=verbose
      Prefer: return-content
      DataServiceVersion: 3.0;

      { "Price": "3.5" }

      Responds with (trimmed for readability):

      HTTP/1.1 200 OK
      Content-Type: application/json;odata=verbose;charset=utf-8
      Preference-Applied: return-content
      DataServiceVersion: 3.0;
      {"d":{
      "__metadata":{…},

      "ID":1,
      "Name":"Milk",
      "Description":"Low fat milk",
      "ReleaseDate":"1995-10-01T00:00:00",
      "DiscontinuedDate":null,
      "Rating":3,
      "Price":"3.5"
      }}
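
      A client-side view of this exchange might look like the following Python sketch. Nothing is sent over the network: the helper only assembles the request headers shown above and unwraps a verbose JSON response. The helper names are hypothetical, and the response body is a shortened copy of the one above with the elided __metadata entry left out.

```python
import json
from decimal import Decimal


def build_update_headers(return_content: bool) -> dict:
    """Headers for a PATCH request that asks the server to include or omit the payload."""
    return {
        "Accept": "application/json;odata=verbose",
        "Content-Type": "application/json;odata=verbose",
        "Prefer": "return-content" if return_content else "return-no-content",
        "DataServiceVersion": "3.0;",
    }


def unwrap_verbose(body: str) -> dict:
    """Verbose JSON wraps the entity in a top-level "d" object."""
    return json.loads(body)["d"]


headers = build_update_headers(return_content=True)
response_body = '{"d": {"ID": 1, "Name": "Milk", "Rating": 3, "Price": "3.5"}}'
entity = unwrap_verbose(response_body)

# Price arrives as a JSON string, so convert it explicitly before doing arithmetic.
price = Decimal(entity["Price"])
```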

      Association links

      Each navigation link can now also specify an association link, which is the URL used to manipulate the association. For example, this is a part of the ATOM payload for ~/Products(1):

      <link
      rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/Category"
      type="application/atom+xml;type=entry"
      title="Category"
      href="Products(1)/Category" />
      <link
      rel="http://schemas.microsoft.com/ado/2007/08/dataservices/relatedlinks/Category"
      type="application/xml"
      title="Category"
      href="Products(1)/$links/Category" />
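
      A client can tell the two kinds of link apart by their rel prefix. The Python sketch below parses the two link elements above with the standard library; wrapping them in an Atom entry element is an assumption made so the fragment is well-formed XML.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}
RELATED = "http://schemas.microsoft.com/ado/2007/08/dataservices/related/"
RELATED_LINKS = "http://schemas.microsoft.com/ado/2007/08/dataservices/relatedlinks/"

# The two links from the payload, wrapped in an assumed <entry> element.
FRAGMENT = """<entry xmlns="http://www.w3.org/2005/Atom">
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/Category"
        type="application/atom+xml;type=entry" title="Category"
        href="Products(1)/Category" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/relatedlinks/Category"
        type="application/xml" title="Category"
        href="Products(1)/$links/Category" />
</entry>"""


def classify_links(xml_text: str) -> dict:
    """Map each navigation property to its navigation and association URLs."""
    result = {}
    for link in ET.fromstring(xml_text).findall("atom:link", ATOM_NS):
        rel = link.get("rel", "")
        if rel.startswith(RELATED_LINKS):
            name, kind = rel[len(RELATED_LINKS):], "association"
        elif rel.startswith(RELATED):
            name, kind = rel[len(RELATED):], "navigation"
        else:
            continue  # not a navigation or association link
        result.setdefault(name, {})[kind] = link.get("href")
    return result


links = classify_links(FRAGMENT)
```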

      We are working on adding more V3 features to the Demo service and we’ll be updating the service as we have them available.


      The Datanami Staff (@datanami) posted Chevron Drills Down Big Data Assets on 5/10/2012:

      The world isn’t running out of oil and natural gas, it’s running out of easy supplies of it. For energy companies, including Chevron, this means drilling ever-deeper and in more remote locations for deposits of precious fossil fuels, all tasks that require advanced big data tools and technologies.

      According to Paul Siegele, president of Energy Technology at Chevron, the data involved in oil and gas exploration is staggering. However, without the technologies his team provides, which include distributed sensors, high-speed communications, massive data-mining operations and remote drilling operations data management, the tasks of finding and exploiting new natural resources would be nearly impossible.

      According to Technology Review’s Jessica Leber, programs that invoke the “digital oilfield” approach to the new era of oil and gas will play a huge role in the future of energy companies. Leber says, “The ones that are most successful at operating remotely and using data wisely will claim big rewards. Chevron cites industrywide estimates suggesting 8 percent higher production rates and 6 percent higher overall recovery from a ‘fully optimized’ digital oil field.”

      As Leber notes, despite advancing renewable technologies, the International Energy Agency projects that global oil demand will still be growing by 2035 as more people use cars. And, as extraction becomes more difficult, almost $20 trillion in investments will be needed to satisfy these future needs.

      Chevron is currently deploying up to eight global "mission control" centers as part of its digital program. Each is focused on a particular goal, such as using real-time data to make collaborative decisions in drilling operations, or managing wells and imaging reservoirs for higher production yields. The purpose is to improve performance at more than 40 of its biggest energy developments. The company estimates that these centers will help it save $1 billion a year.

      Chevron's internal IT traffic alone exceeds 1.5 terabytes a day. Leber says that to keep up with this and new digital oil field projects, software tools are being developed at big oil contracting companies, such as Halliburton and Schlumberger, and big IT providers including Microsoft and IBM. (Emphasis added.)


      <Return to section navigation list>

      Windows Azure Service Bus, Access Control, Identity and Workflow

      No significant articles today.


      <Return to section navigation list>

      Windows Azure VM Role, Virtual Network, Connect, RDP and CDN

      No significant articles today.


      <Return to section navigation list>

      Live Windows Azure Apps, APIs, Tools and Test Harnesses

      Joe Brinkman (@jbrinkman) posted Getting Started with Windows Azure and DotNetNuke on 5/12/2012:

      Every week it seems more and more people are asking me how they can run DotNetNuke on Windows Azure. Last year David Rodriguez released the DotNetNuke Azure Accelerator which aims to simplify the process of installing DotNetNuke on Windows Azure. It was a great alternative to manually deploying DotNetNuke but it required the user to know how to use the Windows Azure Management Portal for setting up their Azure account. The original version of the accelerator also included the DotNetNuke installation package within the download. This meant that the accelerator was closely tied to the DotNetNuke version and had to be updated with every DotNetNuke release.

      A few weeks ago, David uploaded a 6.2 beta version of the accelerator. This release resolves a number of outstanding issues with the accelerator and really takes it to a whole new level. With the 6.2 release users won’t have to mess with the Windows Azure Management Portal. All of the tasks for configuring and deploying the Azure resources are completely automated. The Accelerator presents a wizard which walks you through all of the steps for provisioning your compute, storage and database accounts on Azure. It even makes it possible to configure your account for RDP access and Windows Azure Connect access. These updates will make installing DotNetNuke on Windows Azure much simpler.

      In addition to the simplified installation, the 6.2 accelerator also removed the tight coupling between the accelerator and DotNetNuke releases. The accelerator now queries DotNetNuke.com to find out the current version of DotNetNuke and has Windows Azure automatically download and install this version from CodePlex. As a result the accelerator has shrunk from a 66 MB download to a 4 MB package. This speeds up the deployment process as well since you don’t have to re-upload DotNetNuke to Windows Azure over your own network but can instead take advantage of Microsoft’s bandwidth availability between the CodePlex and Windows Azure data centers. While not available in the Beta release, we are working to have all the infrastructure pieces in place so that you will even be able to choose which version of DotNetNuke to install from within the accelerator. As new versions of DotNetNuke are released, the accelerator will automatically update its list, giving you access to the latest versions of DotNetNuke without any changes to the accelerator.

      I am really excited about all of these changes and have been going back and forth with David on even more ways to enhance the accelerator to make it even easier and more powerful. I have created a short video which walks through the installation process and as you can see it really is very easy to get started with Windows Azure and DotNetNuke.


      David Gristwood posted Real World Windows Azure: LinkFresh Cloud QA – An application that helps businesses deliver safe, fresh, high quality food on 5/11/2012:

      One of the best things about my work with Windows Azure is when a system actually goes live, so our team is going to be doing more blogs about people building on the Windows Azure platform, so you can see what sorts of applications they built, why they chose the cloud and Windows Azure, and what they learnt in the process.

      The blogs will all be on our partner blog under the “Real World Windows Azure” tag, and I will post some of the projects on my blog, such as this first one:

      LINKFresh Cloud QA by Anglia Business Solutions.

      Fresh food producers are subject to stringent food safety and quality standards, so managing and monitoring this requires a flexible and powerful mobile solution. LINKFresh Cloud Quality Assurance, part of the Anglia Business Solutions' LINKFresh Cloud platform, delivers a mobile device application which offers a complete Business Management solution for the food supply chain industry. The application runs on Windows Azure and captures produce information, including photo capture, signatures and geo location and stores it safely and securely in the cloud for processing.

      Richard Jones, Anglia’s Director of Technology, shared their reasoning for building this application on the cloud and the benefits that it brings to their customers: “Reaching a large volume of mobile users stretching across the globe always presents challenges for support, infrastructure and rollout. Using cloud technology simply made sense for us; we didn’t want the expense of building a backend server environment for each territory we wish to deploy in. The cloud means that we can switch on and off IT infrastructure as and when we require. This makes a great customer experience as the end user is always presented with a fast and usable app that works reliably all the time. Being able to provide an unparalleled level of service to mobile users makes our solution scale and keeps customers enthusiastic and loyal to our product offering.”

      He has also commented on how the company made some of its key technical decisions: “Windows Azure takes the pain out of deploying large scale systems. The ability to use familiar development tools such as Visual Studio to manage and deploy a cloud solution is truly remarkable. Microsoft's ability to ramp up the number of instances and servers we deploy on makes reaching a large geography of users very straight-forward; we are deploying to a large mobile workforce and Azure fits perfectly with our requirement.”

      David Gristwood, Windows Azure architect with Microsoft, noted that “This is a really innovative use of Windows Azure. The peak loads that their system has to deal with during busy periods are a perfect match for Windows Azure’s elasticity, which can scale up to cope with high throughput and traffic, then scale back down in quiet periods to reduce the running costs, which makes for a really compelling story. Wrapping up their programming logic in Windows Azure web roles reduces the amount of programming effort required to manage this elasticity. Mobile scenarios are very common with Windows Azure, because they enable a wide range of devices to connect to Windows Azure through REST based web service calls, to upload data and get the very latest information, whilst on the move, or in this case, working in the middle of a field.”

      Fast Facts
        • The LINKFresh Cloud Quality Assurance runs on Windows Azure and uses SQL Azure and Windows Azure storage. The mobile client runs on Windows Phone, Android and iOS
        • To cope with peak time demands, the system has been designed to handle hundreds of GB per day from thousands of workers in fields, farms and warehouses
        • The single code base used for this solution has simplified the development and deployment, and Windows Azure gives Anglia resilience, high availability and international reach without the up-front capital expenditure.
        • Anglia Business Solutions was founded in 1981 and is a gold certified partner based in Cambridge, UK, and was Dynamics partner of the year 2010

      Wely Lau (@wely_live) continued his series with An Introduction to Windows Azure (Part 2) on 5/10/2012:

      This is the second article of a two-part introduction to Windows Azure. In Part 1, I discussed the Windows Azure data centers and examined the core services that Windows Azure offers. In this article, I will explore additional services available as part of Windows Azure which enable customers to build richer, more powerful applications.

      Additional Services

      1. Building Block Services

      ‘Building block services’ were previously branded ‘Windows Azure AppFabric’. The main objective of building block services is to enable developers to build connected applications. The three services under this category are:

      (i) Caching Service

      Generally, accessing RAM is much faster than accessing disk, including storage and databases. For that reason, Microsoft have developed an in-memory and distributed caching service to deliver low latency, high-performance access, namely Windows Server AppFabric Caching. However, there are some activities, such as installing and managing, and some hardware requirements like investing in clustered servers, which have to be handled by the end-user.

      Windows Azure Caching Service is a self-managed, yet distributed, in-memory caching service built on top of the Windows Server AppFabric Caching Service. Developers will no longer have to install and manage the Caching Service / Clusters. All they need to do is to create a namespace, specify the region, and define the Cache Size. Everything will get provisioned automatically in just a few minutes.

      Creating new Windows Azure Caching Service

      Additionally, Azure Caching Service comes along with a .NET client library and session providers for ASP.NET, which allow the developer to quickly use them in the application.
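To make the RAM-versus-disk argument above concrete, here is a minimal cache-aside sketch in Python. It is not the Azure Caching client API: `CacheAside` and `load_from_store` are hypothetical names, and a plain dictionary stands in for the distributed cache.

```python
class CacheAside:
    """Illustrative cache-aside wrapper; a dict stands in for the cache service."""

    def __init__(self, load_from_store):
        # load_from_store is a hypothetical callable that reads from the
        # slower backing store (database, blob storage, and so on).
        self._load = load_from_store
        self._cache = {}
        self.misses = 0

    def get(self, key):
        # Only go to the backing store when the key is not cached yet.
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._load(key)
        return self._cache[key]
```

Repeated calls to `get` with the same key reach the backing store only once, which is the latency saving the caching service is meant to provide.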

      (ii) Access Control Service

      Third Party Authentication

      With the trend for federated identity / authentication becoming increasingly popular, many applications have relied on authentication from third party identity providers (IdPs) such as Live ID, Yahoo ID, Google ID, and Facebook.

      One of the challenges developers face when dealing with different IdPs is that they use different standard protocols (OAuth, WS-Trust, WS-Federation) and web tokens (SAML 1.1, SAML 2.0, SWT).

      Multiple ID Authentication

      Access Control Service (ACS) allows application users to authenticate using multiple IdPs. Instead of dealing with different IdPs individually, developers just need to deal with ACS and let it take care of the rest.

      AppFabric Access Control Services

      (iii) Service Bus

      Windows Azure’s Service Bus allows secure messaging and connectivity across multiple network hierarchies. It enables hybrid model scenarios, such as connecting cloud applications with on-premise systems. The Service Bus allows applications running on Windows Azure to call back to on-premise applications located behind firewalls and NATs.

      Service Bus Diagram

      Migrating from an on-premise Windows Communication Foundation (WCF) framework to the Service Bus is trivial as they use a similar programming approach.

      2. Data Services

      Data Services consists of SQL Azure Reporting and SQL Azure Data Sync, both of which are still available as Community Technology Previews (CTPs).

      (i) SQL Azure Reporting

      SQL Azure Reporting aims to provide developers with a service similar to that of the current SQL Server Reporting Service (SSRS), with the advantages of being in the cloud. Developers are still able to use familiar tools such as SQL Server Business Intelligence Development Studio. Migrating on-premise reports is also easy as SQL Azure Reporting is essentially built on top of SSRS architecture.

      (ii) SQL Azure Data Sync

      SQL Azure Data Sync is a cloud-based data synchronization service built on top of the Microsoft Sync Framework. It enables synchronization between a cloud database and another cloud database, or with an on-premise database.

      SQL Azure Data Sync

      (from Windows Azure Bootcamp)

      3. Networking

      Three networking services are available today:

      (i) Windows Azure CDN

      The Content Delivery Network (CDN) caches static content such as video, images, JavaScript, and CSS at the closest node to users. By doing so, it improves performance and provides the best user experience. There are currently 24 nodes available globally.

      Windows Azure CDN Locations

      (ii) Windows Azure Traffic Manager

      Traffic Manager is designed to enable high performance and high availability of web applications, by providing load-balancing across multiple hosted services in the six available data centers. In its current CTP guise, developers can select one of the following rules:

      • Performance – detects the location of the user traffic and routes it to the best online hosted service based on network performance.
      • Failover – based on an ordered list of hosted services, traffic is routed to the online service highest on the list.
      • Round Robin – equally distributes traffic to all hosted services.
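The three rules above can be sketched in a few lines of Python. This is only an illustration of the selection logic as described, not Traffic Manager's implementation: the function names, the `online` flag, and the `latency_ms` table are all assumptions made for the example.

```python
from itertools import cycle


def pick_performance(services, latency_ms):
    """Performance: choose the online service with the lowest latency.

    latency_ms is a hypothetical mapping of service name to the network
    latency measured for the current user's location.
    """
    online = [s for s in services if s["online"]]
    return min(online, key=lambda s: latency_ms[s["name"]])["name"]


def pick_failover(services):
    """Failover: services is an ordered list; return the first online one."""
    for s in services:
        if s["online"]:
            return s["name"]
    raise RuntimeError("no hosted service is online")


def round_robin(services):
    """Round Robin: hand out the listed services in turn, indefinitely."""
    return cycle(s["name"] for s in services)
```

For example, with three hosted services of which the first is offline, `pick_failover` returns the second one, while `pick_performance` returns whichever online service has the lowest entry in `latency_ms`.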

      (iii) Windows Azure Connect

      Windows Azure Connect supports secure network connectivity between on-premise resources and the cloud by establishing a virtual network environment between them. With Windows Azure Connect, cloud applications appear to reside on the same network environment as on-premise applications.

      Windows Azure Connect

      (from the Windows Azure Platform Training Kit)

      Windows Azure Connect enables scenarios such as:

      • Using an on-premise SMTP Server from a cloud application.
      • Migrating enterprise apps which require an on-premise SQL Server to Windows Azure.
      • Domain-joining a cloud application running in Azure to an Active Directory.

      4. Windows Azure Marketplace

      Windows Azure Marketplace is a centralized online market where developers are able to easily sell their applications or datasets.

      (i) Marketplace for Data

      Windows Azure Marketplace for Data is an information marketplace allowing ISVs to provide datasets (either free or paid) on any platform, and available to the global market. For example, Average House Prices, Borough provides annual and quarterly house prices based on Land Registry data in the UK. Developers can then subscribe and utilize this dataset to develop their application.

      (ii) Marketplace for Applications

      Windows Azure Marketplace for Applications enables developers to publish and sell their applications. Many, if not all, of these applications are SaaS applications built on Windows Azure. Applications submitted to the Marketplace must meet a set of criteria.

      Conclusion

      To conclude, we have examined the huge investment that Microsoft is making and will continue to make in Windows Azure, the core of its cloud strategy. Three fundamental services (Compute, Storage, and Database) are offered to developers to satisfy the basic needs of developing cloud applications. Additionally, with the other Windows Azure services (Building Block Services, Data Services, Networking, and Marketplace), developers will find it increasingly easy to develop rich and powerful applications. The foundations of this cloud offering are robust and we should continue to look out for new features to be added to this platform.

      References

      This article was written using the following resources as references:

      This post was also published at A Cloud Place blog.


      Himanshu Singh (@himanshuks) posted Real World Windows Azure: Interview with IDV Solutions Vice President Scott Caulk on 5/10/2012:

      As part of the Real World Windows Azure series, I connected with Scott Caulk, vice president of Product Management at IDV Solutions to learn more about how the company uses Windows Azure. Read IDV Solutions’ success story here. Read on to find out what he had to say.

      Himanshu Kumar Singh: Tell me about IDV Solutions.

      Scott Caulk: We provide large organizations with business intelligence, security, and risk visualization solutions. Our flagship product, Visual Fusion, is a business intelligence software solution that helps organizations unite content from virtually any data source and then deliver it to end users in a visual, interactive context for better business insights. Visual Fusion and our other products have helped us establish a strong presence among major organizations in government and private industry sectors, including the U.S. Department of Homeland Security, the U.S. Department of Transportation, Pfizer, Pacific Hydro, BP, and the Thomson Reuters Foundation.

      HKS: What led you to develop Fetch! on Windows Azure?

      SC: In 2010, one of our customers asked us for an application that would make it easy for their mobile employees to access the organization’s large data collection. The prototype we created turned out to work really well, and the customer liked it. The application was first deployed entirely on the customer’s servers and was accessed by end users through mobile email; however, our development team decided that a more interactive experience was necessary. To create a more interactive version that maintained device compatibility, we built it as a rich web application.

      What we discovered is that getting the solution to run in an enterprise infrastructure but also exposing it to the Internet began to create risks. We had the foundation of a good idea that could be marketed to other customers as well, but realized that IT departments would worry about data security issues and opening up ports in their firewalls so mobile devices could use the Internet to access internal data.

      So we began looking for a cloud platform that could help provide the essential functionality of providing data access to mobile users while minimizing the exposure risk for corporate data. We chose Windows Azure as the cloud platform on which to develop the app. Called Fetch!, it’s a hybrid solution that uses cloud capabilities to link mobile users with on-premises enterprise data. As a platform-agnostic mobile app, Fetch! supports the broadest range of common mobile operating systems, including Windows Phone, Android, and iOS, as well as any device capable of sending and receiving email.

      HKS: What capabilities does Fetch! deliver?

      SC: Fetch! allows mobile corporate employees to access a wide range of information such as data grids and text, charts and graphs, documents and images, scorecards, and maps. It supports full access to systems such as IDV Solutions Visual Fusion; Microsoft SharePoint and related PerformancePoint services; Microsoft SQL Server databases; Oracle databases; Salesforce.com; and custom line-of-business systems and web services.

      HKS: What factors led you to choose Windows Azure?

      SC: We’re a member of the Microsoft Partner Network, and have expertise with Windows development tools such as the .NET Framework, the Microsoft Visual Studio development system, and ASP.NET. This [experience] gave Windows Azure an advantage in our evaluation process. The tight integration of Windows Azure with our existing development environment made our development efforts go more smoothly.

      Windows Azure also provided a key feature that was invaluable during the development cycle: the Windows Azure Service Bus. The Service Bus provides a hosted, secure, and widely available infrastructure for secure messaging and communications relay capabilities. It offers connectivity options for service endpoints that in other cloud solutions would be difficult or impossible to reach. The Service Bus relay service also eliminates the need to set up a new connection for each communications instance, resulting in faster and more reliable connections for mobile users.

      With Windows Azure, we could jump into the project using our .NET expertise, quickly ramp up, and then deploy a solid app, whether for a smartphone or tablet device. Time to deployment was short, and any modifications we have for Fetch! will be very fast. If we had gone for a non-Microsoft cloud platform, our development time probably would have been slowed by weeks, if not months.

      HKS: How does Fetch! work?

      SC: When accessing data through Fetch!, mobile employees use an email address and password to log on, and a web application provides the means for requesting data. After the user enters a command to query data, the command is processed in Windows Azure and then sent via the Windows Azure Service Bus to a service running within the corporation’s on-premises IT infrastructure. The on-premises service uses “connectors” that are part of the Fetch! solution to link to a variety of data sources. The relevant data is collected and then returned using a web service, which formats it and presents it to the user. The speed of the process depends on the particular IT infrastructure, but it typically occurs in just a few seconds.

      Other platform components used in the solution include Windows Azure Storage, which provides scalable and easily accessible data storage services, and Windows Azure Compute, which lets us run application code in the cloud. Each Windows Azure Compute instance runs as a virtual machine that is isolated from other Windows Azure customers and handles activities such as network load balancing and failover for continuous availability. Additionally, Fetch! can connect to data hosted on SQL Azure.

      HKS: What are some of the advantages of using Windows Azure for Fetch!?

      SC: By using Windows Azure as an integral part of Fetch! we were able to use features that ease enterprise customers’ concerns about the security of data accessed from mobile devices. And with Windows Azure Service Bus, our customers can have the Fetch! service running inside their infrastructure without the need to poke holes in their firewall to get data in and out. This is especially important for customers in security fields, where the safety of data is critical. With Service Bus, our customers’ mobile users can connect to rich enterprise information without exposing the network to any additional security concerns.

      HKS: What are some of the benefits of using Windows Azure for your business?

      SC: When developing Fetch!, we weren’t sure if we would have just a few customers or many, including large customers that might add thousands of users at a time to the solution. Windows Azure gives us the scalability to very quickly add large volumes of users as the product is adopted across more and more of our customer base.

      Furthermore, due to their large size, the enterprise customers in our target markets can cause sharp spikes in traffic, sometimes overnight in cases where they add entire groups or departments. This means that Fetch! needs to run on a cloud platform that can deliver enormous scalability at a moment’s notice. That’s one of the big benefits of Windows Azure. We have customers with tens of thousands of users, and being able to click a button to go from having two load-balanced web servers to having a dozen or more in a matter of minutes is a very powerful feature for us and for our customers.

      We also benefit from end-to-end development tools that provide a seamless environment for innovations and upgrades. Windows Azure was key in helping our company build and deliver a mobile data-access app that meets the security and scalability needs of our large customers.


      Leandro Boffi (@leandroboffi) described Building Elastic and Resilient Cloud Applications in a 5/2/2012 post (missed when published):

      Over the last few months I’ve been collaborating as an advisor with the Microsoft Patterns and Practices Team on a very interesting project. They worked on an integration pack for Windows Azure and Enterprise Library.

      One of the outcomes of that work is a book called “Building Elastic and Resilient Cloud Applications”. This book provides background information on autoscaling and transient fault handling which makes it useful even if you don’t want to use the Application Blocks.

      The P&P guys sent me a copy of the book as a gift and they mentioned my name on the list of advisors. I am very proud and thankful to have participated in this project.

      You can find the details on MSDN by clicking here.


      <Return to section navigation list>

      Visual Studio LightSwitch and Entity Framework 4.1+

      No significant articles today.


      <Return to section navigation list>

      Windows Azure Infrastructure and DevOps

      My (@rogerjenn) Uptime Report for my Live OakLeaf Systems Azure Table Services Sample Project: April 2012 of 5/10/2012 begins:

      My live OakLeaf Systems Azure Table Services Sample Project demo runs two small Windows Azure Web role instances from Microsoft’s South Central US (San Antonio, TX) data center. I didn’t receive (or misplaced) the usual Pingdom Monthly Report for April, so here’s the detailed uptime report from Pingdom.com for April 2012:

      (Pingdom uptime report image)

      This is the first case of monthly uptime below the 99.95% minimum Windows Azure Compute Service Level Agreement (SLA) guarantee since upgrading the sample app to two instances. My Flurry of Outages on 4/19/2012 for my Windows Azure Tables Demo in the South Central US Data Center post updated 4/30/2012 provides details about the outage on 4/19/2012 and its root cause analysis.
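As a rough worked example of what a 99.95% monthly guarantee allows, the arithmetic can be sketched in Python as below. The function names are made up for this illustration, and the 30-minute outage in the usage note is a hypothetical figure, not the value from the Pingdom report.

```python
def allowed_downtime_minutes(sla_percent, days_in_month):
    """Minutes of downtime a month can contain while still meeting the SLA."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


def uptime_percent(downtime_minutes, days_in_month):
    """Monthly uptime percentage for a given number of downtime minutes."""
    total_minutes = days_in_month * 24 * 60
    return 100 * (total_minutes - downtime_minutes) / total_minutes
```

A 30-day month has 43,200 minutes, so `allowed_downtime_minutes(99.95, 30)` comes to about 21.6 minutes. A hypothetical 30 minutes of downtime in such a month would therefore fall below the 99.95% threshold.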

      And continues with detailed Pingdom response time data for the month of March 2012.


      Joel Forman offered Ramping Up on the Windows Azure Platform: 200 Level training documents in a 5/11/2012 post to the Slalom Consulting blog:

      I was recently asked to put together some material for consultants with the goal of getting to a “200 Level” of knowledge on the Windows Azure Platform and its breadth of capabilities. I thought this would be an opportune time to revamp a previous “getting started” post that I did with some updated content. Below is a 10-hour self-paced training plan, designed to bring someone up to that 200 level…

      Read: Understanding the Different Platform Components (~1 hour)

      Take a few minutes to read a brief overview of some of the different features of the Windows Azure Platform. Think of these as building blocks. They can be used individually or together to solve problems and build applications.

      Slalom Consultant Joel Forman specializes in cloud computing and the Windows Azure Platform.

      Core Features:

      In addition, take a look at these emerging features to see some of the additional possibilities with the platform:

      Watch: Explore Sessions from the Learn Windows Azure Event (~3 hours)

      Microsoft put on an all-day virtual training event late last year on Windows Azure via Channel9. Explore some of the sessions from notable Microsoft folks such as Scott Guthrie, Dave Campbell, and Mark Russinovich.

      Explore: Scenarios, Case Studies, and Platform Interoperability (~1 hour)

      Explore some additional cloud topics that pique your interest…

      Scenarios: Get a feel for what scenarios are being highlighted for cloud. Are you working in these areas today in your current or previous projects?

      Case Studies: Get a feel for some of the different case studies that exist today. Browse the case study gallery and pick 2-3 to read.

      Interoperability: The Developer Center is a great place to start to get a feel for the different languages and technologies that work with Windows Azure. Take a few minutes to browse around and check out the content for .NET, Java, Node.js, and more. Note all of the tutorials, how-to guides, and documentation of common tasks for the different languages. These can definitely come in handy down the line.

      Develop: Download the Windows Azure Platform Training Kit and Start Developing (~5 hours)

      The Windows Azure Platform Training Kit continues to be one of the best resources for hands-on training. It contains great hands-on labs that are perfect for building your first applications on Windows Azure. Choose from labs around compute, storage, SQL Azure, Access Control Service, and more.

      Recommended Introductory Hands-On Labs:

      • Introduction to Windows Azure
      • Introduction to SQL Azure
      • Introduction to Service Bus
      • Introduction to the AppFabric Access Control Service 2.0

      After spending some time in these areas, you should have a good base level of understanding of the entire platform and its capabilities.


      The Windows Azure Operations Team reported the following [Windows Azure Compute] [North Central US] [Yellow] Windows Azure Outage in North Central US sub-region on 5/11/2012:

        • May 11 2012 4:25AM We are experiencing an issue with all Windows Azure services in the North Central US sub-region. We are actively investigating this issue and working to resolve it as soon as possible. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.
        • May 11 2012 5:25AM We have traced down the root cause of this outage to a faulty networking device and we are working on mitigating the impact. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.
        • May 11 2012 6:25AM We have isolated the faulty networking device at 11:04 PM PST and have observed network traffic improvement in the North Central US sub-region. Full restoration of the network traffic is still being validated. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this caused our customers.
        • May 11 2012 7:25AM We are still observing some network traffic disruption in the North Central US sub-region, down to less than 3% packet loss across a limited set of clusters in this sub-region, and continue to validate the mitigation in place. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.
        • May 11 2012 8:25AM We are still observing intermittent and limited network traffic disruption in the North Central US sub-region, but the potential customer impact is now very low. We have engaged in the repair steps on the networking device causing the traffic disruption. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.
        • May 11 2012 9:25AM The repair steps are still underway on the networking device that caused the traffic disruption. Network traffic has been steady at 100% with no packet loss since the previous notification. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.
        • May 11 2012 9:30AM The repair steps have been executed and successfully validated. Network traffic has been fully restored in the North Central US sub-region. We apologize for any inconvenience this caused our customers.

      Slow news day.


      Himanshu Singh (@himanshuks) posted Windows Azure Community News Roundup (Edition #18) to the Windows Azure Team blog on 5/11/2012:

      Welcome to the latest edition of our weekly roundup of the latest community-driven news, content and conversations about cloud computing and Windows Azure. Here are the highlights from this week.

      Articles and Blog Posts

      Upcoming Events, and User Group Meetings

      North America

      Europe

      Other

      Recent Windows Azure Forums Discussion Threads

      Send us articles that you’d like us to highlight, or content of your own that you’d like to share. And let us know about any local events, groups or activities that you think we should tell the rest of the Windows Azure community about. You can use the comments section below, or talk to us on Twitter @WindowsAzure.


      Valery Mizonov updated his Cloud Application Framework & Extensions (CloudFx) v1.2.0.16 Nuget package on 5/10/2012:

      The Cloud Application Framework & Extensions (CloudFx) is a Swiss Army knife for Windows Azure developers which offers a set of production quality components and building blocks intended to jump-start the implementation of feature-rich, reliable and extensible Windows Azure-based solutions and services.

      To install Cloud Application Framework & Extensions (CloudFx), run the following command in the Package Manager Console:

      PM> Install-Package Microsoft.Experience.CloudFx

      Valery also added documentation for CloudFx Samples:

      Introduction

      The following sample code accompanies the Cloud Application Framework & Extensions (CloudFx) library and demonstrates some basic "How-To" style samples as well as advanced samples showing several CloudFx features altogether.

      The list of samples is growing, so please check back later if you are interested in any specific scenarios not currently covered. Please also consider using the "Questions & Answers" section to submit your questions, feedback or general comments.

      What is CloudFx?

      The Cloud Application Framework & Extensions (CloudFx) library is comprised of production-quality components and building blocks created with a single goal in mind: to enable Windows Azure developers to build high-end cloud-based solutions without having to understand as many of the relatively complex mechanical tasks each platform service API involves.

      CloudFx implements a collection of patterns derived from real customer solutions to provide professional features at reduced complexity, which results in faster development of robust and performant Windows Azure applications.

      For example, when working with Windows Azure Queues, CloudFx provides a pair of simple Put and Get (send and receive) actions that understand the application’s specific business objects, reliably and efficiently invoke the underlying storage client API, handle errors, process messages that don’t fit on a queue, and so on.
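One common way to handle messages that don’t fit on a queue is to park the oversized payload elsewhere and enqueue only a pointer to it. The Python sketch below illustrates that general pattern with in-memory stand-ins. It is not CloudFx code or its API: `OverflowQueue`, the `max_bytes` limit, and the dictionary used in place of blob storage are all assumptions made for the example.

```python
import json
import uuid

MAX_QUEUE_BYTES = 64 * 1024  # illustrative limit; check the real service quota


class OverflowQueue:
    """Put/Get pair that spills oversized messages to a side store."""

    def __init__(self, max_bytes=MAX_QUEUE_BYTES):
        self.max_bytes = max_bytes
        self.queue = []  # stand-in for a storage queue
        self.blobs = {}  # stand-in for blob storage

    def put(self, obj):
        body = json.dumps(obj).encode("utf-8")
        if len(body) <= self.max_bytes:
            # Small enough: the serialized object travels on the queue itself.
            self.queue.append({"inline": True, "data": body})
        else:
            # Too large: store the payload separately and enqueue a pointer.
            blob_id = str(uuid.uuid4())
            self.blobs[blob_id] = body
            self.queue.append({"inline": False, "data": blob_id})

    def get(self):
        if not self.queue:
            return None
        msg = self.queue.pop(0)
        # Resolve the pointer (and delete the stored payload) when needed.
        body = msg["data"] if msg["inline"] else self.blobs.pop(msg["data"])
        return json.loads(body.decode("utf-8"))
```

Callers see the same `put` and `get` calls whether or not a message overflowed, which is the kind of simplification the paragraph above describes.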

      For those exploring Windows Azure Service Bus, CloudFx offers straightforward Publish and Subscribe operations that make the most efficient use of the asynchronous messaging APIs and implement all the relevant best practices so that the developer doesn’t have to focus on low-level plumbing and can concentrate on business domain.

      An example of how the CloudFx framework was used in the implementation of a real-world complex cloud-based solution can be found here.


      David Linthicum (@DavidLinthicum) asserted “A recent survey from Cisco finally tells us what we suspected all along” in a deck for his This just in: Cloud computing is hard and takes a long time post of 5/10/2012 for InfoWorld’s Cloud Computing blog:

      Cisco Systems has surveyed more than 1,300 IT professionals to determine the top priorities and challenges they face when migrating applications and information to the cloud. Guess what? It's harder, and it takes longer than many thought.

      Duh.

      Of course, these surveys have a tendency to be self-serving, so it's no surprise that this one concludes that your networks need upgrading before you can move to the cloud. After all, the survey was sponsored by a networking company.

      But putting aside the obvious self-promotion, the broader conclusions confirm what many of us have suspected for some time and what anyone considering a cloud migration must understand: It's not easy. Cloud computing is a challenge that takes longer than most organizations have budgeted.

      For example, only 5 percent of IT decision makers surveyed have been able to migrate at least half of their total applications to the cloud. I'm not sure it's even that much in the world at large, given what I've seen in my travels. However, the survey states that by 2013, that number is expected to significantly rise, as more than one in five will migrate at least half of their total applications to the cloud.

      The survey also captures the difficulty through humor, with conclusions such as, "More than one-quarter said they could train for a marathon or grow a mullet in a shorter period of time than it would take to migrate their company's applications to the cloud" or "nearly one-quarter of IT decision makers said that, over the next six months, they are more likely to see a UFO, a unicorn, or a ghost before they see their company's cloud migration starting and finishing."

      The core reason for the difficulty is, of course, the fact that moving to the cloud is a platform-migration problem -- in this case from traditional systems to private and public clouds. IT pros already know migrations are always problematic, especially if you're considering business-critical systems. But why businesses haven't equated cloud migration to every other migration is a mystery, perhaps the result of "silver bullet" sales claims by vendors.

      What makes the migration to the cloud even more difficult is the lack of information about the process. Many new cloud users are lost in a sea of hype-driven desire to move to cloud computing, without many proven best practices and metrics.

      I suspect we're seeing the start of the pain, but I believe things will improve as we learn how to work through the problems of migrating to the cloud. I've already started growing my mullet.


      <Return to section navigation list>

      Windows Azure Platform Appliance (WAPA), Hyper-V and Private/Hybrid Clouds

      Gavin Clarke (@gavin_clarke) quoted the “HP cloud chief: The hard work and zen of Windows open source” in an introduction to his Lessons for HP box jockeys on the Amazon warpath article of 5/12/2012 for The Register:

      You might think the hard work for Hewlett-Packard is done, after it came from behind to build its own Amazon-style cloud so quickly. But the difficult part – taking on Amazon and winning with open source – lies ahead.

      The world's biggest PC maker lifted the lid yesterday on the biggest change to its business in recent history.

      The HP Cloud Compute free beta ended on Thursday with a rash of 40 partners announcing that they are all now available on or in support of HP's cloud – Rightscale, ActiveState, CloudBees, Dome9, EnterpriseDB and others. The period of construction and half-priced sign-ups is finished. Now it's down to business.

      Forget former chief executive Carly Fiorina's folly of buying Compaq in 2002 for $25bn: that just added more computers to the existing line-up. And Mark Hurd's $13.9bn EDS buy in 2008? Services based on servers and, yup, PCs.

      HP Cloud Compute is a leap from terra firma into a blue-sky world. HP is hoping to sell something frighteningly intangible for a bunch of box jockeys who feel rather more comfortable with product in, product out, margin and mark-up. Now they are selling compute cycles, storage capacity and bandwidth: stuff you can't see, touch or mark up.

      A tweet from April's OpenStack design conference in San Francisco, California, hinted at the scale of what HP is selling: 2,000 nodes, "multi-petabytes" of Swift storage capacity and "pricing to match AWS". HP won't comment on these numbers.

      Stacking up

      HP is doing all this using a piece of software it can't claim to own and that, in itself, doesn't confer any kind of competitive advantage. HP Cloud Compute uses the open-source cloud architecture called OpenStack – in which it is a part player along with nearly 200 other companies also fighting for a place in a gold rush era in the history of computing. Dell is putting out OpenStack reference implementations, NTT is building furiously with OpenStack and AT&T has joined up.

      Worse – yes, there is worse to come for Palo Alto's box monkeys – the PC maker, along with all those other OpenStackers, is going up against Amazon's EC2, the critical mass monster of cloud.

      Amazon's cloud earned the bookseller an estimated $1.08bn in the first nine months of last year – up 70.4 per cent compared to 2010 – while the amount of data Amazon holds hit 762 billion objects, more than doubling last year.

      It will get bigger as Amazon becomes an infrastructure for other clouds such as Heroku and makes it easier for enterprise customers to upload their data and embrace .NET to pull in Microsoft shops.

      HP's road to cloud has been laid down pretty quickly, if a little late. It only joined the OpenStack project relatively late – in July last year. That was after 90 others, including Rackspace, a co-founder which had been there 12 months before. Amazon's EC2 floated in 2006 while Microsoft opened Azure in early 2010.

      HP has also delivered in spite – yes, in spite – of a partnership with its long-time PC and server operating system buddy Microsoft. The computer maker is more than just one of Redmond's biggest partners on PCs, servers and services; it was supposed to be on the inside of the cloud tent too – as an early adopter of Microsoft's cloud, Windows Azure.

      HP, Dell and Fujitsu were named by Microsoft in 2010 as planning appliances running the Windows Azure code that would slot in to their own data centres as well as those of their customers. The three firms – along with eBay – were also supposed to launch services running Windows Azure appliances in their data centres. (Emphasis added.)

      Yet two years later, HP has started a cloud service, and it doesn't run Windows Azure, it uses OpenStack instead. Bear in mind that HP only joined OpenStack one year after the supposed Microsoft Windows Azure deal.

      HP is now delivering OpenStack compute, storage and networking running on HP servers, storage and network equipment. Also, the operating system running HP's cloud is Linux - versions of Ubuntu, Fedora and CentOS can be spun up as virtual instances.

      What happened? Why did the cloud business side of HP go for open source when the PC side dumped the open-source client-side experiment of previous CEOs Hurd and Apotheker – which began with the purchase of Palm for $1.2bn – and went for Windows?

      Slow Windows

      Zorawar Biri Singh, the HP executive in charge of HP's cloud services, who joined from IBM in early 2010 to float the service, told The Reg this week that open source and OpenStack, licensed under an Apache licence, allowed HP to get up and running faster.

      He said: "In the year I've been here, my emphasis has been to get our core infrastructure up ... For HP it was about getting the infrastructure to run the stuff on and have a compelling platform."

      Singh reckons HP remains committed to Windows Azure, but the language sounds clipped, as it seems Windows Azure will be a guest on OpenStack – a platform as a service (PaaS) – instead of providing the underlying infrastructure as a service (IaaS). Also, its Windows Server will likely become just another guest alongside the Linuxes.

      Redmond not out

      "We are still working very closely with Microsoft," Singh assures The Reg. "Windows Azure is very much on our roadmap. We will talk about it shortly – we had our hands full. Microsoft is a great partner and we will leverage that."

      Singh reckons Windows Azure was PaaS from the get-go, but that's not how Microsoft spun it in 2010, when the idea was for Windows Azure to run on those appliances. "PaaS might have been premature."

      Let's assume that as OpenStack is under an open-source licence, HP was automatically granted the freedom to customise and tune the OpenStack code – which it couldn't have done if it had worked with the proprietary and Microsoft-controlled Windows Azure. Fine, but the real problem is there's a dearth of skilled engineers out there who know how to actually program OpenStack code.

      Those who do know much about it are busy setting up their own start-ups – companies like Piston Cloud. Piston Cloud's chief technology officer Josh McKenty was chief architect for NASA on the Nebula component that comprised one half of 2010's original OpenStack announcement with Rackspace.

      HP has been hiring in an attempt to close the gap: in September 2011 Singh brought on board MySQL architecture director and Drizzle lead architect Brian Aker as an HP fellow, with special responsibility for platform-as-a-service and development-as-a-service engineering.

      MySQL was sold by Oracle to enterprises and OEMs but runs at web scale inside Facebook and Twitter. Aker, in turn, has recruited engineers in MySQL, Drizzle, Scala, Postgres, Ruby on Rails, Python and OpenStack. HP went on to announce MySQL-as-a-service on its cloud.

      A lot of the hard work has taken place in the basic OpenStack Nova compute and Swift storage code, Singh says, especially tuning Nova to manage compute frameworks and to transfer information. "Not too many people have the experience or will of conviction to dive in - there's a ton of hard work. It's really difficult, but folks are committed to it," he said.

      Singh says HP is still learning how to work with OpenStack but is bringing to bear its experience in large data centres. "We understand how to serve large web 2.0 customers doing scale. We've taken that hardware IP software stack and converted it to things in open source like OpenStack," he said.

      No billion-dollar data centres

      Singh declined to comment on the numbers of customers, nodes or petabytes mentioned in the tweet but reckoned HP is building on a "global scale". "This has lots to do with how do you deal with scale and how to you scale to hundreds, thousands and hundreds of thousands of nodes across geographies and giving it the robustness and service quality that customers expect over time. We are in the early stages of figuring that out," he said.

      HP won't build out its own data centres to achieve that, unlike Microsoft or Amazon, he said. Rather the company will use its existing facilities and exploit relationships with service providers who will install HP servers and reference architectures running OpenStack.

      Singh is particularly excited by the HP EcoPOD, a take on the old Sun Microsystems trailer-computing idea that you drop off and remove by truck, according to demand.

      "You won't see HP invest billions of dollars in data centres next to a dam because those are built for specific workloads, like search clusters. You will see us populate existing data centres [and] provide eco pods to rapidly stand up to meet local demand," Singh said.


      Travis Wright (@radtravis) reported Open Beta for Private Cloud MOF Guide - Now Available for Download! on 5/10/2012:

      Managing and Operating a Microsoft Private Cloud—How to Apply the Microsoft Operations Framework (MOF)

      The Microsoft Operations Framework team is working on a new guide: Managing and Operating a Microsoft Private Cloud—How to Apply the Microsoft Operations Framework.

      Get the beta here.

      This guide leads you through the process of how to manage and operate a Microsoft private cloud using the service management processes of the Microsoft Operations Framework (MOF). The guide applies MOF’s IT service management principles to that conceptual architecture and technology stack. It describes how to maximize the potential of MOF’s people, process, and technical capabilities to manage and operate a Microsoft private cloud.

      Follow this guidance for a private cloud that is better aligned to meet your business needs. Employ MOF’s service management functions (SMFs) to help align IT and business goals, which can enable you to perform private cloud activities effectively and cost-efficiently.

      This guide focuses on the SMFs in the Operate Phase and the Manage Layer of MOF to give IT pros and managers what they need to know about managing and operating a private cloud. Management reviews—internal controls that ensure goals are met to achieve business value—are also included.

      Tell us what you think! Download and review the beta guide, then send your feedback to mofpm@microsoft.com by June 11, 2012. We would especially appreciate feedback in the following areas:

      · Usefulness – Is the technical depth of this guide sufficient for the topics covered? Will this guide be useful to you on a day-to-day basis? What portions of the guide are the most useful to your organization?

      · Usability – Is the structure or flow of this guide effective? Is the information presented in a clear and logical manner? Can you easily find key content?

      · Impact – Do you anticipate that this guide will save you time and accelerate deployment of Microsoft products in your organization? Has this guide had a positive influence on your opinion of the Microsoft technologies it addresses?

      Benefits for participation:

      · You get an early look at the guide.

      · You will be listed on the acknowledgments page for providing usable feedback.

      We look forward to hearing from you! Your input helps to make each guide as helpful and useful as possible. Thanks in advance for taking the time to review Managing and Operating a Microsoft Private Cloud—How to Apply the Microsoft Operations Framework (MOF).

      Subscribe to the MOF beta program and we will notify you when new beta guides become available for your review and feedback. These are open beta downloads. If you are not already a member of the MOF Beta Program and would like to join, follow these steps:

      1. Go here to join the MOF beta program:

      https://connect.microsoft.com/site14/InvitationUse.aspx?ProgramID=1880&InvitationID=MOFN-M6H9-PV3X

      If the link does not work for you, copy and paste it into the web browser address bar.

      2. Sign in using a valid Windows Live® ID.

      3. Enter your registration information.

      4. Continue to the MOF program beta page, scroll down to Microsoft Operations Framework, and click the link to join the MOF beta program.

      Please send your comments and feedback to mofpm@microsoft.com.


      The Microsoft Server and Cloud Platform Team announced System Center Cloud Services Process Pack RTM Now Available on 5/10/2012:

      The Microsoft Solution Accelerators team is pleased to announce that the System Center Cloud Services Process Pack is now available for download.

      The System Center Cloud Services Process Pack is Microsoft’s Infrastructure as a Service solution built on the System Center platform. With the System Center Cloud Services Process Pack, enterprises can realize the benefits of Infrastructure as a Service while simultaneously leveraging their existing investments in the Service Manager, Orchestrator, Virtual Machine Manager, and Operations Manager platforms.

      Read more about this new process pack on the System Center Service Manager Team Blog.


      Cloud Security and Governance

      Maureen O’Gara asserted “Intel expects more than 3B connected users & 15B connected devices to drive more than 1,500 exabytes of cloud traffic by 2015” in a deck for her Intel & McAfee on Mission to End Cloud Nail-Biting post of 5/11/2012 to the Cloud Security Journal blog:

      Security concerns are the biggest thing holding back cloud adoption, but Intel says it'll take it and its pricey $7.68 billion McAfee acquisition at least another five years to bring cloud security up to the best-in-class traditional enterprise security available now.

      Not very reassuring, is it?

      Intel also says, "A private cloud added to your IT infrastructure is like adding another door to your house - it's another entry point for bad guys to get in."

      Oh, great, just what we need.

      Despite those sobering thoughts Intel is still expecting more than three billion connected users - that's a B, dear, as in billion - and 15 billion connected devices to be driving more than 1,500 exabytes of cloud traffic by 2015, and IDC figures that about 20% of all digital data - roughly, say, 1,400 exabytes - a mammoth load - will be stored or processed in the cloud.

      That's the pickle - and since the daring and adventurous can never be kept at home - Intel and McAfee the other day sketched out what they could do now using existing technology to prevent companies from running off the cliff and their sensitive data from falling into unsavory hands.

      They also said a happy word - but not much more than that - about what still needs to be done.

      The object of the game is to entice enterprises still hesitant to take the plunge.

      The twosome has got a two-pronged approach pairing software with hardware widgetry while they embark on a "multi-year mission" to reassure IT departments that:

      • Data, applications and infrastructure will be secure.
      • Corporate compliance requirements will be automatically met.
      • Corporate security policies will be automatically applied throughout the workload lifecycle.
      • And easy-to-implement solutions will provide 24/7 reporting.

      Heady promises under the circumstances.

      Intel commissioned a study that found that 61% of IT professionals are concerned about the loss of visibility in private clouds; 55% are concerned about lack of data protection in public clouds; and 57% won't put data that requires specific compliance into cloud data centers.

      It figures it has to overcome these immediate reservations because it has to sell the cloud to make money.

      Okay, so Intel and McAfee say they've got a fashionably holistic approach to cloud security to "establish" confidence in using private, public and hybrid clouds.

      That basically means:

      • Securing cloud data centers. Cloud infrastructure is typically virtualized and shared across multiple lines of business or even multiple organizations, which reduces control and visibility into infrastructure security, a situation that's amplified in off-premise public cloud infrastructure managed by third parties. Using McAfee's existing ePolicy Orchestrator (ePO) - which sets security policies across physical, virtual and cloud environments - with Intel's hardware-based Trusted Execution Technology (TXT), "trustworthy" Xeon E5 servers can be identified. Without saying exactly how, Intel and McAfee developments are expected to strengthen data protection, security enforcement and auditability across cloud infrastructures.
      • Securing network connections. Multiple passwords are a hazard and e-mail and web traffic flowing between remote offices and mobile devices used by employees can be a significant source of data leakage. McAfee's Cloud Security Platform can improve security. It's supposed to evolve to protect cloud infrastructure via better integrity assessments, provide asset control and protection, and enable broader auditing and network security capabilities.
      • Securing the devices that connect to cloud services. BYOD has created a big security problem for the cloud. McAfee Deep Defender is a joint next-generation hardware-enhanced endpoint security solution that uses the Intel Virtualization Technology (Intel VT) built into Core i3, i5 and i7 processors to detect and delete low-level threats, in real-time, that are otherwise difficult to detect with traditional operating system-based security techniques. McAfee's Cloud Identity Manager provides on-premise software-based access control for cloud applications using enterprise identities, and Intel Cloud SSO (single sign-on) provides "identity-as-a-service" in the cloud. Future developments are supposed to focus on advancing data and identity protection for devices accessing cloud services.
      • Lastly, developing standards for cloud security. Intel and McAfee are involved in the Cloud Security Alliance and Open Data Center Alliance to enable open interoperable solutions for cloud security.

      It's all supposed to result in a "worry-free" cloud.

      By the way, according to the figures Intel has rounded up, private clouds currently have 14% penetration, expected to go to 42% by 2014, and public clouds currently have 7% penetration, going to 23% by 2014.


      Cloud Computing Events

      Eric D. Boyd (@EricDBoyd) reported The Midwest’s Cloud Conference – CloudDevelop 2012 will take place 8/3/2012 in Columbus, OH:

      Those who know me know that I am passionate about Cloud Computing. I speak at and attend many Cloud conferences, but typically these conferences are on the East or West coast. This summer, Michael Collier, Jared Farris, Brian Prince and I are launching the Midwest’s premier Cloud Computing conference, CloudDevelop 2012.

      CloudDevelop 2012 will be held on Friday, August 3rd in Columbus, OH at The Ohio State University’s Ohio Union. CloudDevelop is a vendor and technology neutral conference with sessions on various cloud technologies and cloud application development. We feel that a mixture of vendors, technologies and topics, will provide a dynamic, informative and engaging conference for all attendees.

      Today is the last call for speaker submissions and if you are a subject matter expert or are doing something interesting in the Cloud with Windows Azure, Amazon AWS, Heroku, AppHarbor, Google AppEngine, Rackspace Cloud or other cloud platforms and technologies, submit your sessions today! http://clouddevelop.org/SubmitProposal.html

      Conferences like CloudDevelop wouldn’t be possible without the support from generous sponsors. If you are a cloud provider, tool vendor or just someone interested in sponsoring CloudDevelop, there are still some slots available and you can download the sponsorship prospectus to learn more about the opportunities. http://clouddevelop.org/CloudDevelopProspectus2012.pdf


      The IEEE announces IEEE Cloud 2012, the 5th International Conference on Cloud Computing, to be held 6/25 to 6/29/2012 at the Hyatt Regency Waikiki Resort and Spa, Honolulu, HI:

      “Change we are leading” is the theme of CLOUD 2012. Cloud Computing has become a scalable services consumption and delivery platform in the field of Services Computing. The technical foundations of Cloud Computing include Service-Oriented Architecture (SOA) and Virtualizations of hardware and software. The goal of Cloud Computing is to share resources among the cloud service consumers, cloud partners, and cloud vendors in the cloud value chain. The resource sharing at various levels results in various cloud offerings such as infrastructure cloud (e.g. hardware, IT infrastructure management), software cloud (e.g. SaaS focusing on middleware as a service, or traditional CRM as a service), application cloud (e.g. Application as a Service, UML modeling tools as a service, social network as a service), and business cloud (e.g. business process as a service). Extended versions of selected research track papers will be invited for potential publication in the IEEE Transactions on Services Computing (TSC), International Journal of Web Services Research (JWSR), and International Journal of Business Process Integration and Management (IJBPIM). Both TSC and JWSR are indexed by SCI and EI [Link]. CLOUD Proceedings are EI indexed. According to Thomson Scientific, JWSR is listed in the 2008 Journal Citation Report with an Impact Factor of 1.200. The journal ranks #47 of 99 in the Computer Science, Information Systems and ranks #37 of 86 in Computer Science, Software Engineering.

      Under the umbrella of the IEEE 2012 World Congress on Services (SERVICES 2012), CLOUD 2012 will co-locate with the following service-oriented sister conferences: the 19th IEEE 2012 International Conference on Web Services (ICWS 2012), the 9th IEEE 2012 International Conference on Services Computing (SCC 2012), the 1st IEEE International Conference on Mobile Services (MS 2012), and the 1st IEEE International Conference on Services Economics (SE 2012). The five co-located theme topic conferences will all center around "services," while each focusing on exploring different aspects (cloud-based services, web-based services, business services, mobile services, and economics of services).

      To discuss this emerging enabling technology of the modern services industry, CLOUD 2012 invites you to join the largest academic conference that explores modern services and software sciences in the field of Services Computing, which has been formally promoted by the IEEE Computer Society since 2003. From a technology foundation perspective, Services Computing has become the default discipline in the modern services industry.

      Sounds to me like a junket, but I’d prefer laid-back Kauai or Maui, not Waikiki madness.


      Other Cloud Computing Platforms and Services

      Barb Darrow (@gigabarb) asked SAP cuddles up with Amazon, but what about Azure? in a 5/11/2012 post to GigaOm’s Structure blog:

      News that SAP will — along with Amazon — certify more of its business applications to run in production on Amazon’s public cloud raises the question of what’s going on with SAP and Microsoft Windows Azure.

      As one commenter to our SAP-Amazon post last night put it: “So much for SAP announcing MS as partner of the year! why isn’t Azure in the pic ?”

      Good question.

      Almost exactly a year ago at SAP’s Sapphire annual conference, Microsoft and SAP pledged better .Net and Azure integration with SAP’s NetWeaver development tools. As V3.co.uk reported at the time, the two companies said:

      Microsoft will add integration features into the Visual Studio platform to improve .NET support for SAP applications. Additionally, the companies will extend the NetWeaver Gateway developer tool to Microsoft’s Azure cloud computing platform, and extend SAP support to Microsoft’s Hyper-V virtualisation platform and the Microsoft System Center.

      But there has been no statement about SAP developing or hosting applications on Azure. Of course the Sapphire 2012 conference next week in Orlando, Fla. would be a perfect venue for an announcement. Microsoft will have a booth at the show and Amazon’s announcement late last night could have been a preemptive move.

      Before people start barking — yes Azure is a PaaS not IaaS so there’s a different equation here. But, SAP has been a close Microsoft partner since 1993 and having SAP-branded business-critical apps running on Azure would be a coup. Of course taking existing code bases and rewriting them for Azure is a huge job whereas Amazon’s infrastructure can typically run existing code. It’s also unclear how much traction Azure has gotten among enterprise customers or generally. As Stacey reported a few weeks ago, Microsoft seems to be focusing on startups with Azure of late.

      I’ve reached out to both Microsoft and SAP for comment and will update this as new information comes in. But to sum up: Microsoft has invested a ton of time, money and energy in Azure — which went live in February 2010 — but which has yet to draw many A-list software vendors. One reason is that most A-list software vendors – including SAP — also compete with Microsoft so the calculus can be difficult for Azure.


      Richard Seroter (@rseroter) asked Is AWS or Windows Azure the Right Choice? It’s Not That Easy. on 5/10/2012:

      I was thinking about this topic today, and as someone who built the AWS Developer Fundamentals course for Pluralsight, is a Microsoft MVP who plays with Windows Azure a lot, and has an unnatural affinity for PaaS platforms like Cloud Foundry / Iron Foundry and Force.com, I figured that I had some opinions on this topic.

      So why would a developer choose AWS over Windows Azure today? I don’t know all developers, so I’ll give you the reasons why I often lean towards AWS:

      • Pace of innovation. The AWS team is amazing when it comes to regularly releasing and updating products. The day my Pluralsight course came out, AWS released their Simple Workflow Service. My course couldn’t be accurate for 5 minutes before AWS screwed me over! Just this week, Amazon announced Microsoft SQL Server support in their robust RDS offering, and .NET support in their PaaS-like Elastic Beanstalk service. These guys release interesting software on a regular basis and that helps maintain constant momentum with the platform. Contrast that with the Windows Azure team that is a bit more sporadic with releases, and with seemingly less fanfare. There’s lots of good stuff that the Azure guys keep baking into their services, but not at the same rate as AWS.
      • Completeness of services. Whether the AWS folks think they offer a PaaS or not, their services cover a wide range of solution scenarios. Everything from foundational services like compute, storage, database and networking, to higher level offerings like messaging, identity management and content delivery. Sure, there’s no “true” application fabric like you’ll find in Windows Azure or Cloud Foundry, but tools like Cloud Formation and Elastic Beanstalk get you pretty close. This well-rounded offering means that developers can often find what they need to accomplish somewhere in this stack. Windows Azure actually has a very rich set of services, likely the most comprehensive of any PaaS vendor, but at this writing, they don’t have the same depth in infrastructure services. While PaaS may be the future of cloud (and I hope it is), IaaS is a critical component of today’s enterprise architecture.
      • It just works. AWS gets knocked from time to time on their reliability, but it seems like most agree that as far as clouds go, they’ve got a damn solid platform. Services spin up relatively quickly, stay up, and changes to service settings often cascade instantly. In this case, I wouldn’t say that Windows Azure doesn’t “just work”, but if AWS doesn’t fail me, I have little reason to leave.
      • Convenience. This may be one of the primary advantages of AWS at this point. Once a capability becomes a commodity (and cloud services are probably at that point), and if there is parity among competitors on functionality, price and stability, the only remaining differentiator is convenience. AWS shines in this area, for me. As a Microsoft Visual Studio user, there are at least four ways that I can consume (nearly) every AWS service: Visual Studio Explorer, API, .NET SDK or AWS Management Console. It’s just SO easy. The AWS experience in Visual Studio is actually better than the one Microsoft offers with Windows Azure! I can’t use a single UI to manage all the Azure services, but the AWS tooling provides a complete experience with just about every type of AWS service. In addition, speed of deployment matters. I recently compared the experience of deploying an ASP.NET application to Windows Azure, AWS and Iron Foundry. Windows Azure was both the slowest option, and the one that took the most steps. Not that those steps were difficult, mind you, but it introduced friction and just makes it less convenient. Finally, the AWS team is just so good at making sure that a new or updated product is instantly reflected across their websites, SDKs, and support docs. You can’t overstate how nice that is for people consuming those services.

      That said, the title of this post implies that this isn’t a black and white choice. Basing an entire cloud strategy on either platform isn’t a good idea. Ideally, a “cloud strategy” is nothing more than a strategy for meeting business needs with the right type of service. It’s not about choosing a single cloud and cramming all your use cases into it.

      A Microsoft shop that is looking to deploy public facing websites and reduce infrastructure maintenance can’t go wrong with Windows Azure. Lately, even non-Microsoft shops have a legitimate case for deploying apps written in Node.js or PHP to Windows Azure. Getting out of infrastructure maintenance is a great thing, and Windows Azure exposes you to much less infrastructure than AWS does. Looking to use a SQL Server in the cloud? You have a very interesting choice to make now. Microsoft will do well if it creates (optional) value-added integrations between its offerings, while making sure each standalone product is as robust as possible. That will be its win in the “convenience” category.

      While I contend that the only truly differentiated offering that Windows Azure has is their Service Bus / Access Control / EAI product, the rest of the platform has undergone constant improvement and left behind many of its early inconvenient and unstable characteristics. With Scott Guthrie at the helm, and so many smart people spread across the Azure teams, I have absolutely no doubt that Windows Azure will be in the majority of discussions about “cloud leaders” and provide a legitimate landing point for all sorts of cloudy apps. At the same time though, AWS isn’t slowing their pace (quite the opposite), so this back-and-forth competition will end up improving both sets of services and leave us consumers with an awesome selection of choices.

      What do you think? Why would you (or do you) pick AWS over Azure, or vice versa?


      Simon Munro (@simonmunro) asserted Microsoft licenses the crown jewels to Amazon Web Services in a 5/10/2012 post:

      When Amazon announced RDS for SQL Server and .NET support for Elastic Beanstalk, the response over the next few hours and days was a gushy ‘AWS cosies up to .NET developers’ or something similar. My first thought upon reading the news was “Man, some people on the Azure team must be really, really pissed at the SQL Server team for letting SQL Server on to AWS”. It’s not that AWS is not a good place for .NET people to cosy up to, and some AWS people are very cosy indeed (except for one, who’s been avoiding me for a year), but .NET people getting friendly with AWS people is bad for Azure. While it is great for .NET developers, the problem for Microsoft is that SQL RDS erodes the primary competitive advantage of Windows Azure.

      AWS has been a long-time supporter of the Windows and .NET ecosystem but the missing part was the lack of a good story around SQL Server. Sure, you have always been able to roll your own SQL instance, but keeping it available and running is a pain. What was lacking, until this week, was a SQL Server database service that negated the need to muck around by yourself. What was needed was a service provided by AWS that you could just click on to enable. Not only does AWS now support SQL (although not 2012 yet) it seems to superficially offer a better SQL than Microsoft does on SQL Azure. I personally think that SQL Azure is a better product and has been developed, from the ground up, specifically for a cloud environment, but that process has left it somewhat incompatible with on-premise SQL Server. AWS’s RDS SQL is plain ol’ SQL Server that everyone is familiar with, with databases bigger than 150GB, backups, performance counters and other things that are lacking in SQL Azure. While the discerning engineer may understand the subtle edge that SQL Azure has over RDS SQL, it will be completely lost on the decision makers.

      AWS has recently been making feints into the enterprise market, a stalwart of more established vendors, including Microsoft. And, if AWS want to present a serious proposition to enterprise customers, they have to present a good Windows/.NET story without gaps — and it seems that they are beginning to fill in those gaps. It is particularly interesting and compelling for larger enterprises where there is a mish-mash of varied platforms, as there inevitably are in large organisations, where one cloud provider is able to take care of them all.

      Windows Azure has Windows/.NET customer support at the core of its value proposition and SQL Azure is a big part of that. If you have a need for SQL Server functionality, why go to anyone other than a big brand that offers it as part of their core services (and I mean ‘service’, not just ‘ability to host or run’)? Windows Azure was that big brand offering that service, where the customer would choose it by default because of SQL support. Well, now there is another big brand with a compelling offering.

      Microsoft obviously can’t go around refusing licenses for their software, and for a business that for decades has had ‘sell as many licenses as possible’ as their most basic cheerleader chant, it is virtually impossible to not sell licenses. The models for the new world of cloud computing clash right here with the old business models that Microsoft is struggling to adapt. For an organisation that is ‘all in’ on the cloud, the only ‘all in’ part of the messages that I am getting is that Microsoft wants to sell as many licenses of their products to cloud providers as possible — putting Windows Azure in a very awkward position. If it was me in the big Microsoft chair, I would have fought SQL RDS as long as possible — but hey, I’m not a highly influential sweaty billionaire, so my opinion doesn’t count and won’t make me a sweaty billionaire either.

      The competitor to Windows Azure is not AWS, or AppEngine or any other cloud provider — the competitor is Windows Server, SQL Server and all the on-premise technologies that their customers are familiar with. I’m sure that Microsoft desperately wanted to get SQL onto RDS and helped as much as they could because that is what their customers were asking for (Microsoft is apparently quite big on listening to customers). I can’t help thinking that every time Microsofties went over for a meeting at the Amazon office to hammer out the details, the Azure team was left clueless in Redmond and the Amazon staff were chuckling behind their backs.

      How does Microsoft reconcile their support for Windows Azure and their support for their existing customers and business models? How do they work with AWS as one of their biggest partners and competitors? While Microsoft struggles with these sorts of questions and tries to decide where to point the ship, Amazon will take whatever money it can off the table, thank you very much.


      Jeff Barr (@jeffbarr) described how to Monitor Estimated Charges Using Billing Alerts in a 5/10/2012 post:

      Introduction
      Because the AWS Cloud operates on a pay-as-you-go model, your monthly bill will reflect your actual usage. In situations where your overall consumption can vary from hour to hour, it is always a good idea to log in to the AWS portal and check your account activity on a regular basis. We want to make this process easier and simpler because we know that you have more important things to do.

      To this end, you can now monitor your estimated AWS charges using our new billing alerts, which use Amazon CloudWatch metrics and alarms.

      What's Up?
      We regularly estimate the total monthly charge for each AWS service that you use. When you enable monitoring for your account, we begin storing the estimates as CloudWatch metrics, where they'll remain available for the usual 14-day period. The following variants on the billing metrics are stored in CloudWatch:

      • Estimated Charges: Total
      • Estimated Charges: By Service
      • Estimated Charges: By Linked Account (if you are using Consolidated Billing)
      • Estimated Charges: By Linked Account and Service (if you are using Consolidated Billing)

      You can use this data to receive billing alerts (which are simply Amazon SNS notifications triggered by CloudWatch alarms) at the email address of your choice. Since the notifications use SNS, you can also route them to your own applications for further processing.

      It is important to note that these are estimates, not predictions. The estimate approximates the cost of your AWS usage to date within the current billing cycle and will increase as you continue to consume resources. It includes usage charges for things like Amazon EC2 instance-hours and recurring fees for things like AWS Premium Support. It does not take trends or potential changes in your AWS usage pattern into account.

      So, what can you do with this? You can start by using the billing alerts to let you know when your AWS bill will be higher than expected. For example, you can set up an alert to make sure that your AWS usage remains within the Free Usage Tier or to find out when you are approaching a budget limit. This is a very obvious and straightforward use case, and I'm sure it will be the most common way to use this feature at first. However, I'm confident that our community will come up with some more creative and more dynamic applications.
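
      As a rough illustration of the budget-limit use case, the sketch below builds the parameters for a CloudWatch alarm on the estimated-charges metric. It is not taken from the original post: the helper name billing_alarm_params, the topic ARN, and the threshold are hypothetical, and the sketch assumes the billing metric is published as EstimatedCharges in the AWS/Billing namespace with a Currency dimension (plus ServiceName for per-service charges). No AWS call is made here; the dictionary is only what you would hand to an SDK.

```python
def billing_alarm_params(threshold_usd, topic_arn, service_name=None):
    """Build keyword arguments for a CloudWatch alarm on estimated charges.

    Assumption: the billing metric is AWS/Billing -> EstimatedCharges with a
    Currency dimension, and an optional ServiceName dimension per service.
    """
    dimensions = [{"Name": "Currency", "Value": "USD"}]
    if service_name is not None:
        dimensions.append({"Name": "ServiceName", "Value": service_name})
    suffix = service_name or "total"
    return {
        "AlarmName": f"estimated-charges-{suffix}-over-{threshold_usd:g}",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": dimensions,
        "Statistic": "Maximum",      # the estimate only grows within a cycle
        "Period": 21600,             # six hours, in seconds
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic that receives the alert
    }


# Hypothetical topic ARN and budget, for illustration only.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:billing-alerts"
total_alarm = billing_alarm_params(100, TOPIC_ARN)
ec2_alarm = billing_alarm_params(60, TOPIC_ARN, service_name="AmazonEC2")

# With an SDK such as boto3 installed and credentials configured, the call
# would look roughly like:
#   boto3.client("cloudwatch").put_metric_alarm(**total_alarm)
```

      Keeping the parameters in a plain dictionary makes it easy to generate one alarm per service or per budget level from the same helper.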

      Here are some ideas to get you started:

      • Relate the billing metrics to business metrics such as customer count, customer acquisition cost, or advertising spending (all of which you could also store in CloudWatch, as custom metrics) and use them to track the relationship between customer activity and resource consumption. You could (and probably should) know exactly how much you are spending on cloud resources per customer per month.
      • Update your alerts dynamically when you change configurations to add or remove cloud resources. You can use the alerts to make sure that a regression or a new feature hasn't adversely affected your operational costs.
      • Establish and monitor ratios between service costs. You can establish a baseline set of costs, and set alarms on the total charges and on the individual services. Perhaps you know that your processing (EC2) cost is generally 1.5x your database (RDS) cost, which in turn is roughly equal to your storage (S3) cost. Once you have established the baselines, you can easily detect changes that could indicate a change in the way that your system is being used (perhaps your newer users are storing, on average, more data than the original ones).
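
      The per-customer and ratio ideas above come down to simple arithmetic. Here is a minimal sketch, assuming you have already retrieved the per-service estimates into a dictionary; the function names, the sample figures, and the 25% tolerance are all made up for illustration.

```python
def cost_per_customer(total_charges, customer_count):
    """Estimated cloud spend per customer for the billing cycle so far."""
    if customer_count <= 0:
        raise ValueError("customer_count must be positive")
    return total_charges / customer_count


def ratio_drift(numerator_cost, denominator_cost, baseline_ratio, tolerance=0.25):
    """Compare a cost ratio with its baseline.

    Returns (ratio, drifted); drifted is True when the ratio differs from
    the baseline by more than `tolerance`, expressed as a fraction of the
    baseline.
    """
    if denominator_cost <= 0:
        raise ValueError("denominator_cost must be positive")
    ratio = numerator_cost / denominator_cost
    drifted = abs(ratio - baseline_ratio) > tolerance * baseline_ratio
    return ratio, drifted


# Hypothetical estimated charges (USD) per service.
charges = {"AmazonEC2": 450.0, "AmazonRDS": 300.0, "AmazonS3": 420.0}

# Baselines from the text: EC2 is about 1.5x RDS, and RDS is about equal to S3.
ec2_vs_rds = ratio_drift(charges["AmazonEC2"], charges["AmazonRDS"], baseline_ratio=1.5)
s3_vs_rds = ratio_drift(charges["AmazonS3"], charges["AmazonRDS"], baseline_ratio=1.0)

per_customer = cost_per_customer(sum(charges.values()), customer_count=390)
```

      In this made-up data the EC2-to-RDS ratio sits on its baseline, while storage has grown to about 1.4x the database cost, which is the kind of shift the text suggests investigating.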

      Enabling and Setting a Billing Alert
      To get started, visit your AWS Account Activity page and enable monitoring of your AWS charges. Once you've done that, you can set your first billing alert on your total AWS charges. Minutes later (as soon as the data starts to flow into CloudWatch) you'll be able to set alerts for charges related to any of the AWS products that you use.

      We've streamlined the process to make setting up billing alerts as easy and quick as possible. You don't need to be familiar with CloudWatch alarms; just fill out the simple form, which you can access from the Account Activity page.

      You'll receive a subscription notification email from Amazon SNS; be sure to confirm it by clicking the included link to make sure you receive your alerts. You can then access your alarms from the Account Activity page or the CloudWatch dashboard in the AWS Management Console.

      Going Further
      If you have already used CloudWatch, you are probably already thinking about some even more advanced ways to use this new information. Here are a few ideas to get you started:

      • Publish the alerts to an SNS topic, and use them to recalculate your business metrics, possibly altering your Auto Scaling parameters as a result. You'd probably use the CloudWatch APIs to retrieve the billing estimates and to set new alarms.
      • Use two separate AWS accounts to run two separate versions of your application, with dynamic A/B testing based on cost and ROI.
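
      For applications that consume the alerts through SNS, the first step is unpacking the notification. The sketch below assumes the usual shape of an SNS delivery, a JSON envelope whose Message field is itself a JSON string describing the alarm, and the field names CloudWatch alarm notifications use as I understand them (AlarmName, NewStateValue, Trigger). The sample payload is hand-written, not captured from a real account, and parse_billing_alarm is a hypothetical helper.

```python
import json


def parse_billing_alarm(sns_body):
    """Extract the fields an application would act on from an alarm notification."""
    envelope = json.loads(sns_body)
    alarm = json.loads(envelope["Message"])  # Message is a JSON string, not an object
    trigger = alarm.get("Trigger", {})
    is_billing = trigger.get("MetricName") == "EstimatedCharges"
    return {
        "alarm_name": alarm["AlarmName"],
        "state": alarm["NewStateValue"],
        "metric": trigger.get("MetricName"),
        "threshold": trigger.get("Threshold"),
        # True only when a billing alarm has just entered the ALARM state.
        "breached": is_billing and alarm["NewStateValue"] == "ALARM",
    }


# Hand-written sample payload for illustration.
sample_body = json.dumps({
    "Type": "Notification",
    "Subject": "ALARM: estimated-charges-total-over-100",
    "Message": json.dumps({
        "AlarmName": "estimated-charges-total-over-100",
        "NewStateValue": "ALARM",
        "OldStateValue": "OK",
        "Trigger": {
            "MetricName": "EstimatedCharges",
            "Namespace": "AWS/Billing",
            "Threshold": 100.0,
        },
    }),
})

alert = parse_billing_alarm(sample_body)
if alert["breached"]:
    # This is where an application might recalculate business metrics or
    # adjust scaling limits; that logic is outside the scope of this sketch.
    print(f"{alert['alarm_name']} crossed {alert['threshold']}")
```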

      I'm sure that your ideas are even better than mine. Feel free to post them, or (better yet), implement them!



      by Roger Jennings (--rj) (noreply@blogger.com) at May 13, 2012 07:36 AM