Sunday, February 17, 2013

The Twitter API Management Model

The objective of this blog post is to explore in detail the patterns and practices Twitter has used in it's API management.

Twitter comes with a comprehensive set of REST APIs to let client apps talk to Twitter.

Let's take few examples...

If you use following with cUrl - it returns the 20 most recent statuses, including retweets if they exist, from non-protected users. The public timeline is cached for 60 seconds. Requesting more frequently than that will not return any more data, and will count against your rate limit usage.

curl https://api.twitter.com/1/statuses/public_timeline.json

The example above is an open API - which requires no authentication from the client who accesses it. But keep in mind... it has a throttling policy associated with the API. That is the rate limit. For example the throttling policy associated with the ..statuses/public_timeline.json API could say, only allow maximum 20 API calls from the same IP address.. like wise.. so.. this policy is a global policy for this API.

1. Twitter has open APIs - where anonymous users can access.
2. Twitter has globally defined policies per API.

Let's take another sample API - statuses/retweeted_by_user - returns the 20 most recent retweets posted by the specified user - given that the user's timeline is not protected. This is also another open API.

But, what if I want to post to my twitter account? I could use the API statuses/update. This updates the authenticating user's status and this is not an open API. Only the authenticated users can access this.

How do we authenticate our selves to access the Twitter API?

Twitter supported two methods. One way is to use BasicAuth over HTTPS and the other way is OAuth 1.0a.

BasicAuth support was removed recently and now the only remaining way is with OAuth 1.0a. As of this writing Twitter doesn't support OAuth 2.0.

Why I need to use the APIs exposed by Twitter - is that I have some external applications that do want to talk to Twitter and these applications use the Twitter APIs for communication.

If I am the application developer - following are the steps I need to follow to build my application to access protected APIs from Twitter.

First the application developer needs to login to Twitter and creates an Application.

Here, the Application is an abstraction for a set of protected APIs Twitter exposes outside.

Each Application you create, needs to define the level of access it needs to those underling APIs. There are three values to pick from.

- Read only
- Read and Write
- Read, Write and Access direct messages

Let's see what these values mean...

If you pick 'Read only' - that means a user who is going to use your Application needs to give it the permission to read. In other words - the user will be giving access to invoke the APIs defined here which starts with GET, against his Twitter account. The only exception is Direct Messages APIs - with Read only your Application won't have access to a given user's Direct Messages - even GETs.

Even you who develop the application - the above is valid for you too as well. If you want to give your application, access your Twitter account - there also you should give the application the required rights.

If you pick Read and Write - that means a user who is going to use your application needs to give it the permission to read and write. In other words - the user will be giving access to invoke the APIs defined here which starts with GET or POST, against his Twitter account. The only exception is Direct Messages APIs - with Read and Write, your application won't have access to a given user's Direct Messages - even GETs or POSTs.

3. Twitter has an Application concept that groups APIs together.
4. Each API declares the actions those do support. GET or POST
5. Each Application has a required access level for it to function[Read only, Read and Write, Read Write and Direct Messages]

Now lets dig in to the run-time aspects of this. I am going to skip OAuth related details here purposely for clarity.

For our Application to access the Twitter APIs - it needs a key. Let's name it as API_KEY [if you know OAuth - this is equivalent to the access_token]. Say I want to use this Application. First I need to go to Twitter and need to generate an API_KEY to access this Application. Although there are multiple APIs wrapped in the Application - I only need a single API_KEY.

6. API_KEY is per user per Application[a collection of APIs].

When I generate the API_KEY - I can specify what level of access I am going to give to that API_KEY - Read Only, Read & Write or Read, Write & Direct Message. Based on the access level I can generate my API_KEY.

7. API_KEY carries permissions to the underlying APIs.

Now I give my API_KEY to the Application. Say it tries to POST to my Twitter time-line. That request also should include the API_KEY. Now once the Twitter gets the request - looking at the API_KEY it will identify the Application is trying to POST to 'my' time-line. Also it will check whether the API_KEY has Read & Write permissions - if so it will let the Application post to my Twitter time-line.

If the Application tries to read my Direct Messages using the same API_KEY I gave it - then Twitter will detect that the API_KEY doesn't have Read, Write & Direct Message permission and the request will fail.

Even in the above case, if the Application tries to post to the Application Developer's Twitter account - there also it needs an API_KEY - from the Application Developer, which he can get from Twitter.

Once user grants access to an Application by it's API_KEY - during the entire life time of the key, the application can access his account. But, if the user wants to revoke the key, Twitter provides a way to do that as well. Basically when you go here, it displays all the Applications you have given permissions to access your Twitter account - if you want, you can revoke access from there.

8. Twitter lets users revoke API_KEYs

Also another interesting thing is how Twitter does API versioning. If you carefully look at the URLs, you will notice that the version number is included in the URL it self - https://api.twitter.com/1/statuses/public_timeline.json. But it does not let Application developers to pick, which versions of the APIs they want to use.

9. Twitter tracks API versions in runtime.
10.Twitter does not let Application developers to pick API versions at the time he designs it.

Twitter also has a way of monitoring the status of the API. Following shows a screenshot of it.

11. Twitter does API monitoring in runtime.

Why OAuth it self is not an authentication framework ?

Let's straight a way start with definitions to avoid any confusions.

Authentication is the act of confirming the truth of an attribute of a datum or entity. If I say, I am Prabath - I need to prove that. I can prove that with something I know, something I have or with something I am. Once I proved my self who I claim I am, then the system can trust me. Sometimes systems do not just want to identify me by just by name. By name could help to identify me uniquely - but how about other attributes of me. Before you get through the boarder control - you need to identify your self - by name - by picture - and also by finger prints and eye retina. Those are validated real-time against the data from the VISA office which issued the VISA for you. That check will make sure its the same person who claimed to have the VISA enters in to the country. That is proving your identity. Proving your identity is authentication.

Authorization is about what you can do. Your capabilities. You could prove your identity at the boarder control by name - by picture - and also by finger prints and eye retina - but it's your VISA that decides what you can do. To enter in to the country you need to have a valid VISA that has not expired. A valid VISA is not a part of your identity - but a part of what you can do. Also what you can do inside the country depends on the VISA type. What you do with a B1 or B2 differs from what you can do with an L1 or L2. That is authorization.

OAuth 2.0 is about authorization. Not about authentication.

OAuth 2.0 is about what you can do ( or on behalf of another user) - not about who you are. So - you cannot use OAuth 2.0 it self for authentication.

I hear what you say. We use OAuth 2.0 based Facebook login to log in to plenty of different web applications. So am I lying ?

Let's pick Facebook it self for an example. I got following image from Facebook login developer documentation. It highlights the OAuth flow with authorization code grant type.

User goes to a Web Application. Clicks on the Login button. Gets redirected to the Facebook for authentication and pass the "code" to the client Web Application. Now the Web Application exchanges the code to an access token. Facebook will only issue an access token for a valid code. And - it also will only issue a code after the user been authenticated. So, the Web Application having access to a valid access token means - and it only means - that particular user is from Facebook. That's it. It's like carrying out a valid, stamped VISA to the boarder control with out having any identification information about the user on it. Looking at the access token - the application cannot say who the user is - it can only say where the user comes from. So it does not help the Web Application to identify the end user uniquely. When the same user logs in next time he could bring a different access token - so the Web Application cannot use the access token as a way of identification. That is why OAuth it self is not an authentication framework.




















Still you think I am lying? Of course you do.

When you login via Facebook - your application also gets certain claims about you. Like your first name, last name and email address - and also your Facebook Id. How does this happen if OAuth does not bring any identity attributes ?

Once the Web Application gets the access token from the Facebook, to get user attributes - the Web Application can call following API with the access token it got.

https://graph.facebook.com/me?access_token=[access_token]

This will get the identity information of the user who authorized the access token.

Now - this helps to identify the user - or to authenticate the user. But, keep in mind that the last step is out side the scope of OAuth and that functionality is provided via Facebook's Graph API.

Not just through Facebook's Graph API - even through SCIM we can identify the user. But with the current specification there seems to be a limitation in doing that in a standard manner.

With the current specification to get the details of a specific user we use the following.

 GET /Users/2819c223-7f76-453a-919d-413861904646
 Host: example.com
 Accept: application/json Authorization: Bearer h480djs93hd8

But, just after the OAuth flow - we only have the access token at the client side. No concrete information about the end user. So we cannot use the above.

By having a pre-defined userid say "me" [just like in Facebook Graph API] - cab be made to provide back the logged in user's info.

 GET /Users/me
 Host: example.com
 Accept: application/json
 Authorization: Bearer h480djs93hd8

In the above case - the information about the user corresponding to the provided Bearer token will be returned back. So, based on that, client can identify the end user with it's attributes.

Apart from the above two approaches - OpenID Connect is the best one built on top of OAuth 2.0 to identify the end user.

With OpenID Connect in addition to the OAuth access token - the client or the Web Application also gets an ID token. The ID Token is a security token that contains claims/identity information about the authentication event and other requested claims from the client. The ID Token is represented as a JWT - and it can be used to prove the identity of the end user.

SAML 2.0 Bearer Assertion Profile for OAuth 2.0

SAML 2.0 Bearer Assertion Profile which is built on top of OAuth 2.0 Assertion Profile defines the use of a SAML 2.0 Bearer Assertion as a means for requesting an OAuth 2.0 access token as well as for use as a means of client authentication.

The OAuth 2.0 Assertion Profile is an abstract extension to OAuth 2.0 that provides a general framework for the use of Assertions as client credentials and/or authorization grants with OAuth 2.0.

This specification profiles the OAuth 2.0 Assertion Profile...

1.  To define an extension grant type that uses a SAML 2.0 Bearer Assertion to request an OAuth 2.0 access token when a client wishes to utilize an existing trust relationship, expressed through the semantics of (and digital signature calculated over) the SAML Assertion, without a direct user approval step at the authorization server.

2.  To define how a SAML Assertion can be used as a client authentication mechanism. The use of an Assertion for client authentication is orthogonal to and separable from using an Assertion as an authorization grant. They can be used either in combination or separately.

This specification further defies the format and processing rules for the SAML Assertion are intentionally similar, though not identical, to those in the Web Browser SSO Profile defined in SAML Profiles.

Now let's have a look at an end to end scenario which explains how to use SAML 2.0 Bearer Assertion as an grant type.


1. User been redirected to the SAML2 IdP and user authenticates to the SAML2 IdP.

2. SAML2 IdP issues a SAML2 Token with following characteristics and Client Web App verifies the SAML2 token and authenticates the user.
  • The Assertion's Issuer element will contain a unique identifier for the entity that issued the Assertion - which is the SAML2 IdP in our case.
  • The Assertion will contain Conditions element with an AudienceRestriction element with an Audience element containing a URI reference that identifies the Authorization Server.
  •  The Assertion will contain Conditions element with an AudienceRestriction element with an Audience element containing a URI reference that identifies the Client Web App, which is the SAML2 Service Provider and Client Web App must verify element.
  •  The Assertion MUST contain a Subject element. The subject MAY identify the resource owner for whom the access token is being requested. When using an Assertion as an authorization grant, the Subject SHOULD identify an authorized accessor for whom the access token is being requested (typically the resource owner, or an authorized delegate).
  • The Assertion MUST have an expiry that limits the time window during which it can be used. The expiry can be expressed either as the NotOnOrAfter attribute of the Conditions element or as the NotOnOrAfter attribute of a suitable SubjectConfirmationData element.
  •  The element must contain at least on element that allows the authorization serve to confirm it as a Bearer Assertion.  Such a element MUST have a Method attribute with a value of "urn:oasis:names:tc:SAML:2.0:cm:bearer".
  • The SubjectConfirmation element MUST contain a SubjectConfirmationData element, unless the Assertion has a suitable NotOnOrAfter attribute on the Conditions element, in which case the SubjectConfirmationData element MAY be omitted. When present, the SubjectConfirmationData element MUST have a Recipient attribute with a value indicating the token endpoint URL of the authorization server (or an acceptable alias). The  SubjectConfirmationData element MUST have a NotOnOrAfter attribute that limits the window during which the Assertion can be confirmed.  The SubjectConfirmationData element MAY also contain an Address attribute limiting the client address from which the Assertion can be delivered.  Verification of the Address is at the discretion of the authorization server.
  • The Assertion MUST have an expiry that limits the time window during which it can be used. The expiry can be expressed either as the NotOnOrAfter attribute of the Conditions element or as the NotOnOrAfter attribute of a suitable SubjectConfirmationData element.
  •  If the Assertion issuer authenticated the subject, the Assertion SHOULD contain a single  AuthnStatement representing that authentication event.
  •  If the Assertion was issued with the intention that the presenter act autonomously on behalf of the subject, an AuthnStatement SHOULD NOT be included.  The presenter SHOULD be identified in the NameID or similar element, the element, or by other available means.
  • Other statements, in particular elements, MAY be included in the Assertion.
3. Now Client Web App wants to access a RESTful service which secured with OAuth 2.0 and it has to get an access token on behalf the logged in user first. Now using the SAML2 grant type, Client Web App Requests an access token from the OAuth 2.0 Authorization Server.

     POST /token.oauth2 HTTP/1.1
     Host: as.example.com
     Content-Type: application/x-www-form-urlencoded

     grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Asaml2-bearer&
     assertion=PHNhbWxwOl...[omitted for brevity]...ZT4


4. Authorization Server validates the SAML token assertion and issues an access token back. Authorization Server does the SAML token verification as mentioned below.
  • The Assertion will contain Conditions element with an AudienceRestriction element with an Audience element containing a URI reference that identifies the Authorization Server and Authorization Server must verify the element.
  •  The SubjectConfirmation element MUST contain a element, unless the Assertion has a suitable NotOnOrAfter attribute on the Conditions element, in which case the SubjectConfirmationData element MAY be omitted. When present, the  SubjectConfirmationData element MUST have a Recipient attribute with a value indicating the token endpoint URL of the authorization server (or an acceptable alias).  The authorization server MUST verify that the value of the Recipient attribute matches the token endpoint URL (or an acceptable alias) to which the Assertion was delivered.  The SubjectConfirmationData element MUST have a NotOnOrAfter attribute that limits the window during which the Assertion can be confirmed.  The SubjectConfirmationData element MAY also contain an Address attribute limiting the client address from which the Assertion can be delivered.  Verification of the Address is at the discretion of the authorization server.
  • The authorization server MUST verify that the NotOnOrAfter instant has not passed, subject to allowable clock skew between systems. An invalid NotOnOrAfter instant on the Conditions element invalidates the entire Assertion.  An invalid NotOnOrAfter instant on a SubjectConfirmationData element only invalidates the individual SubjectConfirmation .  
  • The authorization server MAY reject Assertions with a NotOnOrAfter instant that is unreasonably far in the future.  The authorization server MAY ensure that Bearer Assertions are not replayed, by maintaining the set of used ID values for the length of time for which the Assertion would be considered valid based on the applicable NotOnOrAfter instant.
 5. Now Client Web App can use the access token to access the secured service.

OAuth 2.0 Bearer Token Profile Vs MAC Token Profile

Almost all the implementation I see today are based on OAuth 2.0 Bearer Token Profile. Of course its an RFC proposed standard today.

OAuth 2.0 Bearer Token profile brings a simplified scheme for authentication. This specification describes how to use bearer tokens in HTTP requests to access OAuth 2.0 protected resources. Any party in possession of a bearer token (a "bearer") can use it to get access to the associated resources (without demonstrating possession of a cryptographic key). To prevent misuse, bearer tokens need to be protected from disclosure in storage and in transport.

Before dig in to the OAuth 2.0 MAC profile lets have quick high-level overview of OAuth 2.0 message flow.

OAuth 2.0 has mainly three phases.

1. Requesting an Authorization Grant.
2. Exchanging the Authorization Grant for an Access Token.
3. Access the resources with the Access Token.

Where does the token type come in to action ? OAuth 2.0 core specification does not mandate any token type. At the same time at any point token requester - client - cannot decide which token type it needs. It's purely up to the Authorization Server to decide which token type to be returned in the Access Token response. So, the token type comes in to action in phase-2 when Authorization Server returning back the OAuth 2.0 Access Token.

The access token type provides the client with the information required to successfully utilize the access token to make a protected resource request (along with type-specific attributes). The client must not use an access token if it does not understand the token type.

Each access token type definition specifies the additional attributes (if any) sent to the client together with the "access_token" response parameter. It also defines the HTTP authentication method used to include the access token when making a protected resource request.

For example following is what you get for Access Token response irrespective of which grant type you use.

 HTTP/1.1 200 OK
     Content-Type: application/json;charset=UTF-8
     Cache-Control: no-store
     Pragma: no-cache

     {
       "access_token":"mF_9.B5f-4.1JqM",
       "token_type":"Bearer",
       "expires_in":3600,
       "refresh_token":"tGzv3JOkF0XG5Qx2TlKWIA"
      }

The above is for Bearer - following is for MAC.

HTTP/1.1 200 OK
     Content-Type: application/json
     Cache-Control: no-store

     {
       "access_token":"SlAV32hkKG",
       "token_type":"mac",
       "expires_in":3600,
       "refresh_token":"8xLOxBtZp8",
       "mac_key":"adijq39jdlaska9asud",
       "mac_algorithm":"hmac-sha-256"
     }

Here you can see MAC Access Token response has two additional attributes. mac_key and the mac_algorithm. Let me rephrase this - "Each access token type definition specifies the additional attributes (if any) sent to the client together with the "access_token" response parameter".

This MAC Token Profile defines the HTTP MAC access authentication scheme, providing a method for making authenticated HTTP requests with partial cryptographic verification of the request, covering the HTTP method, request URI, and host. In the above response access_token is the MAC key identifier. Unlike in Bearer, MAC token profile never passes it's top secret over the wire.

The access_token or the MAC key identifier is a string identifying the MAC key used to calculate the request MAC. The string is usually opaque to the client. The server typically assigns a specific scope and lifetime to each set of MAC credentials. The identifier may denote a unique value used to retrieve the authorization information (e.g. from a database), or self-contain the authorization information in a verifiable manner (i.e. a string consisting of some data and a signature).

The mac_key is a shared symmetric secret used as the MAC algorithm key. The server will not reissue a previously issued MAC key and MAC key identifier combination.

Now let's see what happens in phase-3.

Following shows how the Authorization HTTP header looks like when Bearer Token been used.

Authorization: Bearer mF_9.B5f-4.1JqM

This adds very low overhead on client side. It simply needs to pass the exact access_token it got from the Authorization Server in phase-2.

Under MAC token profile, this is how it looks like.

Authorization: MAC id="h480djs93hd8",
                        ts="1336363200",
                        nonce="dj83hs9s",
                        mac="bhCQXTVyfj5cmA9uKkPFx1zeOXM="

This needs bit more attention.

id is the MAC key identifier or the access_token from the phase-2.

ts the request timestamp. The value is  a positive integer set by the client when making each request to the number of seconds elapsed from a fixed point in time (e.g. January 1, 1970 00:00:00 GMT).  This value is unique across all requests with the same timestamp and MAC key identifier combination.

nonce is a unique string generated by the client. The value is unique across all requests with the same timestamp and MAC key identifier combination.

The client uses the MAC algorithm and the MAC key to calculate the request mac.

This is how you derive the normalized string to generate the HMAC.

The normalized request string is a consistent, reproducible concatenation of several of the HTTP request elements into a single string.

By normalizing the request into a reproducible string, the client and server can both calculate the request MAC over the exact same value.

The string is constructed by concatenating together, in order, the following HTTP request elements, each followed by a new line character (%x0A):

1. The timestamp value calculated for the request.

2. The nonce value generated for the request.

3. The HTTP request method in upper case. For example: "HEAD", "GET", "POST", etc.

4. The HTTP request-URI as defined by [RFC2616] section 5.1.2.

5. The hostname included in the HTTP request using the "Host" request header field in lower case.

6. The port as included in the HTTP request using the "Host" request header field. If the header field does not include a port, the default value for the scheme MUST be used (e.g. 80 for HTTP and 443 for HTTPS).

7. The value of the "ext" "Authorization" request header field attribute if one was included in the request (this is optional), otherwise, an empty string.

Each element is followed by a new line character (%x0A) including the last element and even when an element value is an empty string.

Either you use Bearer of MAC - the end user or the resource owner is identified using the access_token. Authorization, throttling, monitoring or any other quality of service operations can be carried out against the access_token irrespective of which token profile you use. 

OAuth 1.0a Bounded Tokens

We had a very interesting discussion today on the $subject.

In fact, one of my colleagues came up with a question - "Why the hell OAuth 1.0 uses two keys to sign messages..? Why Not just the consumer secret ?"

Well.. In fact - the exact reverse of this argument is one key point Eran Hammer highlights in his famous blog post - to emphasize why OAuth 1.0a is better...

"Unbounded tokens - In 1.0, the client has to present two sets of credentials on each protected resource request, the token credentials and the client credentials. In 2.0, the client credentials are no longer used. This means that tokens are no longer bound to any particular client type or instance. This has introduced limits on the usefulness of access tokens as a form of authentication and increased the likelihood of security issues."

Before I answer the question - let me very briefly explain OAuth 1.0a flow..

1. To become an OAuth consumer you need to have a Consumer Key and a Consumer Secret. You can obtain these from the OAuth Service Provider. These two parameters can be stored in a database or in the filesystem.

Consumer Key: A value used by the Consumer to identify itself to the Service Provider.
Consumer Secret: A secret used by the Consumer to establish ownership of the Consumer Key.


2. Using the Consumer Key and Consumer Secret, you need to talk to the OAuth Service Provider and get a Unauthorized Request Token.

Request to get the Unauthorized Request Token will include following parameters.
  • oauth_consumer_key
  • oauth_signature_method
  • oauth_signature
  • oauth_timestamp
  • oauth_nonce
  • oauth_version
  • oauth_callback
Here the certain parameters of the request will be signed by the Consumer Secret. Signature validation at the OAuth Service Provider end will confirm the identity of the Consumer.

As the response to the above token, you will get the following.
  • oauth_token
  • oauth_token_secret
  • oauth_callback_confirmed
Here, the oauth_token is per OAuth Consumer per Service Provider. But can only be used once. If an OAuth consumer tries to use twice, OAuth Service Provide should discard the request.

3. Now the OAuth Consumer, has the Request Token - and it needs to exchange this token to an Authorized Request Token. The Request Token needs to be authorized by the end user. So, the Consumer will redirect the end user to the Service Provider with following request parameter.
  • oauth_token
This oauth_token is the Request Token obtained in the previous step. Here there is no signature - and this token does not need to be signed. The token it self proves the identity of the OAuth Consumer.

As the response to the above, OAuth Consumer will get the following response.
  • oauth_token
  • oauth_verifier
Here, the oauth_token is the Authorized Request Token. This token is per User per Consumer per Service Provider.

The oauth_verifier is a verification code tied to the Authorized Request Token. The oauth_verifier and Authorized Request Token both must be provided in exchange for an Access Token. They also both expire together. If the oauth_callback is set to oob in Step 2, the oauth_verifier is not included as a response parameter and is instead presented once the User grants authorization to Consumer Application. The Service Provider will instruct the User to enter the oauth_verifier code in Consumer  Application. The Consumer must ask for this oauth_verifier code to ensure OAuth authorization can proceed. The oauth_verifier is intentionally short so that a User can type it manually.

Both the above two parameters are returned back to the OAuth Consumer via a browser redirect over SSL.

4. Now the OAuth Consumer has the Authorized Request Token. Now it will exchange that to an Access Token. Once again the Access Token is per User per Consumer per Service Provider. Here the communication is not a browser re-direct but a direct communication between the Service Provider and the OAuth Consumer. This step is needed because, in step 3, OAuth Consumer gets the token through the browser.

The request to the access token will include following parameters.
  • oauth_consumer_key
  • oauth_token
  • oauth_signature_method
  • oauth_signature
  • oauth_timestamp
  • oauth_nonce
  • oauth_version
  • oauth_verifier
Here the oauth_token is the Authorized Request Token from step - 3.

The signature here is generated using a combined key - Consumer Secret and Token Secret [from step -2 ] separated by an "&" character. I will explain later why two keys used here for signing.

As the response to the above, the Consumer will get the following...
  • oauth_token
  • oauth_token_secret
All four steps above should happen over TLS/SSL.

5. Now the OAuth Consumer can access the protected resource. The request will look like following.
  • oauth_consumer_key
  • oauth_token
  • oauth_signature_method
  • oauth_signature
  • oauth_timestamp
  • oauth_nonce
  • oauth_version
Here the oauth_token is the Access Token obtained in step-4. The signature is once again generated by a combined key. Consumer Secret and Token Secret [from step -4 ] separated by an "&" character.

The step - 5 does not required to be on TLS/SSL.

Let's get back to the original question...

Why the hell OAuth 1.0 uses two keys to sign messages..? Why Not just the consumer secret ?

During the OAuth flow there are three places where OAuth Consumer has to sign.

1. Request to get an Unauthorized Request Token. Here the request will be signed by the Consumer Secret only. ( Request for Temporary Credentials )

2. Request to get an Access Token. Here the request will be signed by the Consumer Secret and the Token Secret returned back with the Unauthorized Request Token in step - 2. ( Request for Token Credentials )

Why do we need two keys here ? It's for tighten security. What exactly that means..?

Have a look at step - 3. There the Consumer only sends the oauth_token from step - 2 to get User authorization. And as the response to this the Consumer will get the Authorized Request Token (oauth_token). Now in step - 4 Consumer uses this Authorized Request Token to get the Access Token. Here, the consumer needs to prove that - its the same consumer who got the unauthorized request token, now requesting for the Access Token. To prove the ownership of the Request Token, now the Consumer will sign the Access Token request with both the Consumer Secret and the Token Secret returned back with the Unauthorized Request Token.

3. Request to the protected resource. Here the request to the protected resource will be signed by the Consumer Secret and the Token Secret returned back with the Access Token in step - 4. ( Request for Resources )

Once again we need two keys here for tighten security. If OAuth only relies on the consumer secret for the signature, the person who steals can capture the access token [as it goes to the protected resource on HTTP] and signs the request with the stolen secret key. Then all the users of that OAuth Consumer will be compromised. Since we use two keys here, even though someone steals the consumer secret it cannot harm any of the end users - because it cannot get the Token Secret which is sent to the Consumer over SSL in step - 4. Also - you should never store Token Secrets at the Consumer end - that will reduce any risks of stealing Token Secrets. But, you need to store your Consumer secret as explained previously.

In any case if you have to store the Token Secret, then you need store it encrypted - and also make sure you store it in a different database from where you store your consumer secret. That is the best practice we follow when we store hashed passwords and the corresponding salt values.

Apart from the above benefit - the Token Secret can also act as a salt value. Since the request to the protected resource goes over the wire, a hacker can carry out a known plain text attack to guess the Key used to sign the message. If we only use the Consumer Secret, then all the users of the OAuth Consumer will be compromised. But, if we use the Consumer Secret with the Token Secret, then only that particular user will be compromised - not even the Consumer Secret.

Finally, OAuth presents a Bounded Token protocol over HTTP for accessing the protected resource. That means - each request to the protected resource the Conumer should prove that he owns the token, as well as - he is the one who he claims to be. In the bounded token model, an issued token can only be used by the exact person who the token was issued to. [Compare this with Bearer token in OAuth 2.0]. To prove the identity of the Consumer it self, it can sign the message it self with consumer_secret. To prove that he also owns the token, he needs to sign the message token_secret too. Anyone can know the Access Token, but it's only the original Consumer knows the token_secret. So the key to sign is derived from combining both the keys.

Proposal : Resource Owner Initiated Delegation OAuth 2.0 Grant Type

Please see my previous blog post for more context on the $subject. After that, I was pointed to have a look at UMA. I've been following UMA for sometime and it once again builds a very useful/realistic set of use cases on top of OAuth 2.0. But still - it too, Client initiated or Requester initiated.

I believe my use case can be addressed in OAuth 2.0 it self with a new grant type or in UMA.

Let me break down this in to four phases.

1. Client registers with the Resource Server via Authorization Server
2. Resource Owner grants access to a selected Client.
3. Client gets access_token from the Authorization Server.
4. Client accesses the protected resource on behalf of the Resource Owner.

Let's dig deep.

1. Client registers with the Resource Server via Authorization Server.

This is the a normal OAuth request with any of the defined grant types with the scope resource_owner_initiated.

At the time of Client registration on the Resource Server, Client will be redirected to his Authorization Server.

In return Resource Server gets an access_token along with the client_id. Now Resource Server has a set of registered Clients with corresponding access_tokens.

2. Resource Owner grants access to a selected Client.

Resource Owner logs in to the Resource Server and he can see all registered Clients with the Resource Server.

Resource Owner picks a Client.

Let's say the client_id we got was Foo from Step-1, along with the access_token.

Now,  the Resource Server will redirect the Resource Owner to the Authorization Server.

GET /authorize? grant_type=resource_owner_initiated&client_id=Foo&scope=Bar
Host: server.example.com

If the above is successful then the Authorization Server will send the following to the Resource Server.

HTTP/1.1 200 OK

After this step completed, Authorization Server will create an access_token for the Client - to access the a Protected Resource on behalf of the Resource Owner .

3. Client gets access_token from the Authorization Server.

Client should be able to get all the resource owner delegated access_tokens for him or a single one by specifying the resource owner Id.

To get all the delegated access_tokens - from the Client to the Authorization Server.

POST /token HTTP/1.1
Host: server.example.com
Authorization: Basic XfhgFgdsdsdkhewe
Content-Type: application/x-www-form-urlencoded

grant_type=resource_owner_initiated

Response from the Authorization Server to the Client.

HTTP/1.1 200 OKContent-Type: application/json;charset=UTF-8
Cache-Control: no-store
Pragma: no-cache

{
   "delegated_tokens" : [
      {  "access_token":"2YotnFZFEjr1zCsicMWpAA",
          "token_type":"example",
          "expires_in":3600,
          "refresh_token":"tGzv3JOkF0XG5Qx2TlKWIA",
          "resource_owner_id":"Bob"
      },
     {  "access_token":"2YotnFZFEjr1zCsicMWpAA",
          "token_type":"example",
          "expires_in":3600,
          "refresh_token":"tGzv3JOkF0XG5Qx2TlKWIA",
          "resource_owner_id":"Tom"   
      } 
    ]
}

To get a single delegated access_token - from the Client to the Authorization Server.

POST /token HTTP/1.1
Host: server.example.com
Authorization: Basic XfhgFgdsdsdkhewe
Content-Type: application/x-www-form-urlencoded

grant_type=resource_owner_initiated&resource_owner_id=Bob

Response from the Authorization Server to the Client.

HTTP/1.1 200 OKContent-Type: application/json;charset=UTF-8
Cache-Control: no-store
Pragma: no-cache

{
          "access_token":"2YotnFZFEjr1zCsicMWpAA",
          "token_type":"example",
          "expires_in":3600,
          "refresh_token":"tGzv3JOkF0XG5Qx2TlKWIA",
          "resource_owner_id":"Bob"
}

4. Client accesses the protected resource on behalf of the Resource Owner.

This is normal OAuth flow. Client can now access the Protected Resource with the access_token.

2-legged OAuth with OAuth 1.0 and 2.0

OAuth 1.0 emerged from the large social providers like Facebook, Yahoo!, AOL, and Google. Each had developed its own alternative to the password anti-pattern. OAuth 1.0 reflected their agreement on a single community standard.

In 2009, an attack on OAuth 1.0 was identified which relied on an attacker initiating the OAuth authorization sequence, and then convincing a victim to finish the sequence – a result of which would be the attacker’s account at an (honest) client being assigned permissions to the victim’s resources at an (honest) RS.

OAuth 1.0a was the revised specification version that mitigated the attack.

In 2009, recognizing the value of more formalized standardization, that community contributed OAuth 1.0 to the IETF. It was within the IETF Working Group that the original OAuth 1.0 was reworked and clarified to become the Informative RFC 5849.



In 2010, Microsoft, Yahoo!, and Google created the Web Resource Authentication Protocol (WRAP), which was soon submitted into the IETF WG as input for OAuth 2.0. WRAP proposed significant reworking of the OAuth 1.0a model.

Among the changes were the deprecation of message signatures in favor of SSL, and a formal separation between the roles of ‘token issuance’ and ‘token reliance.’

Development of OAuth 2.0 in the IETF consequently reflects the input of both OAuth 1.0, OAuth 1.0a, and the WRAP proposal. It is fair to say that the very different assumptions about what are appropriate security protections between OAuth 1.0a and WRAP have created tensions within the IETG OAuth WG.

While OAuth 2.0 initially reflected more of the WRAP input, lately (i.e. fall 2010) there has been a swing in group consensus that the signatures of OAuth 1.0a that were deprecated by WRAP are appropriate and desirable in some situations. Consequently, signatures are to be added back as an optional security mechanism.

While many deployments of OAuth 1.0a survive, more and more OAuth 2.0 deployments are appearing – necessarily against a non-final version of the spec. For instance, Facebook, Salesforce, and Microsoft Azure ACS all use draft 10 of OAuth 2.0.

[The above paragraphs are direct extracts from the white-paper published by Ping Identity on OAuth]

OAuth provides a method for clients to access server resources on behalf of a resource owner (such as a different client or an end-user). It also provides a process for end-users to authorize third-party access to their server resources without sharing their credentials (typically, a username and password pair), using user-agent redirections.

In the traditional client-server authentication model, the client requests an access restricted resource (protected resource) on the server by authenticating with the server using the resource owner's credentials. In order to provide third-party applications access to restricted resources, the resource owner shares its credentials with the third-party. This creates several problems and limitations.

1. Third-party applications are required to store the resource owner's credentials for future use, typically a password in clear-text.

2. Servers are required to support password authentication, despite the security weaknesses created by passwords.

3. Third-party applications gain overly broad access to the resource owner's protected resources, leaving resource owners without any ability to restrict duration or access to a limited subset of resources.

4. Resource owners cannot revoke access to an individual third-party without revoking access to all third-parties, and must do so by changing their password.

5. Compromise of any third-party application results in compromise of the end-user's password and all of the data protected by that password.

OAuth addresses these issues by introducing an authorization layer and separating the role of the client from that of the resource owner.

The protocol centers on a three-legged scenario, delegating User access to a Consumer for resources held by a Service Provider. In many cases, a two-legged scenario is needed, in which the Consumer is acting on behalf of itself, without a direct or any User involvement.

OAuth was created to solve the problem of sharing two-legged credentials in three-legged situations. However, within the OAuth context, Consumers might still need to communicate with the Service Provider using requests that are Consumer-specific. Since the Consumers already established a Consumer Key and Consumer Secret, there is value in being able to use them for requests where the Consumer identity is being verified.

This specification defines how 2-legged OAuth works with OAuth 1.0. But it never became an IETF RFC.

With OAuth 1.0 - 2-legged OAuth includes two parties. The consumer and the service provider. Basically in this case consumer also becomes the resource owner. Consumer first needs to register a consumer_key and consumer_secret with the service provider. To access a Protected Resource, the Consumer sends an HTTP(S) request to the Service Provider's resource endpoint URI. The request MUST be signed as defined in OAuth Core 1.0 section 9 with an empty Token Secret.

All the requests to the Protected Resources MUST be signed by the Consumer and verified by the Service Provider. The purpose of signing requests is to prevent unauthorized parties from using the Consumer Key when making Protected Resources requests. The signature process encodes the Consumer Secret into a verifiable value which is included with the request.

OAuth does not mandate a particular signature method, as each implementation can have its own unique requirements. The protocol defines three signature methods: HMAC-SHA1, RSA-SHA1, and PLAINTEXT, but Service Providers are free to implement and document their own methods.

The Consumer declares a signature method in the oauth_signature_method parameter, generates a signature, and stores it in the oauth_signature parameter. The Service Provider verifies the signature as specified in each method. When verifying a Consumer signature, the Service Provider SHOULD check the request nonce to ensure it has not been used in a previous Consumer request.

The signature process MUST NOT change the request parameter names or values, with the exception of the oauth_signature parameter.

2-legged OAuth with OAuth 1.0 - the request to the protected resource will look like following.
http://provider.example.net/profile
            Authorization: OAuth realm="http://provider.example.net/",
            oauth_consumer_key="dpf43f3p2l4k3l03",
            oauth_signature_method="HMAC-SHA1",
            oauth_signature="IxyYZfG2BaKh8JyEGuHCOin%2F4bA%3D",
            oauth_timestamp="1191242096",
            oauth_token="",
            oauth_nonce="kllo9940pd9333jh",
            oauth_version="1.0"

This blog post explains with an example, how to use 2-legged OAuth with OAuth 1.0 to secure RESTful service.

Now, let's look at OAuth 2.0 - still at the stage of a draft specification. This doesn't talk about 2-legged OAuth. But - it can be implemented with different approaches suggested in OAuth 2.0.

Have a look at this & this - both talk about how to implement 2-legged OAuth with OAuth 2.0 - and those discussions are from the OAuth 2.0 IETF work group.

OAuth 2.0 defines four roles:

1. resource owner : An entity capable of granting access to a protected resource (e.g. end-user).

2. resource server : The server hosting the protected resources, capable of accepting and responding to protected resource requests using access tokens.

3. client : An application making protected resource requests on behalf of the resource owner and with its authorization.

4. authorization server : The server issuing access tokens to the client after successfully authenticating the resource owner and obtaining authorization.

In case of 2-legged OAuth, client becomes the resource owner.

We can at very high-level break the full OAuth flow in to two parts.

1. Get a token from the authorization server
2. Use the token to access the resource server

Let's see how the above two steps work under 2-legged OAuth.

OAuth 2.0 defines a concept called - "authorization grant" - which is a credential representing the resource owner's authorization (to access its protected resources) used by the client to obtain an access token. This specification defines four grant types.

1. authorization code
2. implicit
3. resource owner password credentials
4. client credentials

"Client Credentials" is the grant type which goes closely with 2-legged OAuth.

With "Client Credentials" grant type, the client can request an access token using only its client credentials (or other supported means of authentication) when the client is requesting access to the protected resources under its control.

Once the client makes this request to the authorization server - it will return back an access token to access the protected resource.

The access token returned back to the client could be either of type bearer of MAC.

The "mac" token type defined in ietf-oauth-v2-http-mac is utilized by issuing a MAC key together with the access token which is used to sign certain components of the HTTP requests by the client when accessing the protected resource.

The MAC scheme requires the establishment of a shared symmetric key between the client and the server. This is often accomplished through a manual process such as client registration.

The OAuth 2.0 specification offers two methods for issuing a set of MAC credentials to the client using..

1. OAuth 2.0 in the form of a MAC-type access token, using any supported OAuth grant type. [This is what we discussed above - an access token with 'MAC' type]

2. The HTTP "Set-Cookie" response header field via an extension attribute.

When using MAC type access tokens with 2-legged OAuth - the request to the protected resource will look like following.
GET /resource/1?b=1&a=2 HTTP/1.1
     Host: example.com
     Authorization: MAC id="h480djs93hd8",
                        nonce="264095:dj83hs9s",
                        mac="SLDJd4mg43cjQfElUs3Qub4L6xE="
Bearer type is defined here. It's a security token with the property that any party in possession of the token (a "bearer") can use the token in any way that any other party in possession of it can. Using a bearer token does not require a bearer to prove possession of cryptographic key material (proof-of-possession).

When using Bearer type access tokens with 2-legged OAuth - the request to the protected resource will look like following.
GET /resource HTTP/1.1
   Host: server.example.com
   Authorization: Bearer vF9dft4qmT

Also - the issued access token from the Authorization Server to the client, has an 'scope' attribute. [2-legged OAuth with OAuth 1.O doesn't have this scope attribute as well as access token concept - so resource server has to perform authorization separately based on the resource client going to access]

The client should request access tokens with the minimal scope and lifetime necessary. The authorization server will take the client identity into account when choosing how to honor the requested scope and lifetime, and may issue an access token with a less rights than requested.

When securing APIs with OAuth - this 'scope' attribute can be bound to different APIs. So, the authorization server can decide whether to let the client access this API or not.