I’ve recently run into some problems with the CookieContainer. Either I’m doing something seriously wrong or there is some kind of bug w/ the CookieContainer object. It doesn’t seem to update the cookie collection with certain Set-Cookie headers.
This might be a lengthy post and I appologize, but I want to be as thurough as possible so I’m going to list my HTTP sniffing logs as well as my actual implementation code.
public bool SendRequest(HttpWebRequest request, IDictionary<string, string> data, int retries)
{
// copy request in case request instance already failed
HttpWebRequest newRequest = (HttpWebRequest)HttpWebRequest.Create(request.RequestUri);
newRequest.Method = request.Method;
// if POST data was provided, write it to the stream
if (data != null && data.Count != 0)
{
StreamWriter writer = new StreamWriter(newRequest.GetRequestStream());
writer.Write(createPostString(data));
writer.Close();
}
// set request with global cookie container
newRequest.CookieContainer = this.cookieJar;
try
{
using (HttpWebResponse resp = (HttpWebResponse)newRequest.GetResponse())
{
//CookieCollection newCooks = getCookies(resp.Headers);
//updateCookies(newCooks);
this.cookieJar = newRequest.CookieContainer;
this.Html = getResponseString(resp);
/* remainder snipped */
So there is the code, here are two request to responses I sniffed in Fiddler:
Request 1
POST /login/ HTTP/1.1
Host: www.site.com
Content-Length: 47
Expect: 100-continue
Connection: Keep-Alive
Response 1
HTTP/1.1 200 OK
Date: Wed, 02 Dec 2009 17:03:35 GMT
Server: Apache
Set-Cookie: tcc=one; path=/
Set-Cookie: cust_id=2702585226; domain=.site.com; path=/; expires=Mon, 01-Jan-2011 00:00:00 GMT
Set-Cookie: cust_session=12%2F2%2F2009%20%2012%3A3%3A35; domain=.site.com; path=/; expires=Wed 2-Dec-2009 17:33:35
Set-Cookie: refer_id_persistent=0000; domain=.site.com; path=/; expires=Fri 2-Dec-2011 17:3:35
Set-Cookie: refer_id=0000; domain=.site.com; path=/
Set-Cookie: private_browsing_mode=off; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Set-Cookie: member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D; domain=.site.com; path=/; expires=Fri, 01-Jan-2010 17:03:35 GMT
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
Request 2
GET /test?search=jjkjf HTTP/1.1
Host: www.site.com
Cookie: tcc=one; cust_id=2702585226; private_browsing_mode=off; member_session=UmFuZG9tSVYL%5BS%5D%5BP%5DfhH77bYaVoS9j9Yd8ySRkyHHz%5BS%5Dk0S8MVsQ6AyraNlcdcCRC0RkB%5BP%5DfBYVM4vn6JQ3HlJxT3GlJi1RZiMGQaITg7HN9dpu9oRbZgMjhJlXXa%5BP%5D7pFSjqDIZWRr3LAfnhh3btv4E3rvVH42CeOP%5BS%5Dx6kDyvrokQEHyIHPGi7zswZbuHrUdx2XKEKKJzw1unDWfw0LZWjoehAs0QgSOz6Nzp8P4Hp8hqrULdIMch6acPT%5BS%5DbKV8zwugBIcjr5dI3rVR%5BP%5Dv42rsTtQB7dyb%5BP%5DRKb8Y83cGqhHM33hP%5BP%5DUtmbDC1PPfr%5BS%5DPC23lAO%5BS%5DmQ3mOy9x4pgQSOfp40XSfzgVg3EavITaxHBeI5nO3%5BP%5D%5BS%5D2rSDthDfuEm4sT9i6UF3sYd1vlOL0IC9ZsVatV1yhhpQ%5BE%5D%5BE%5D
So as you can see, the CookieContainer (this.cookieJar) which is used for every request is not picking up the Set-Cookie header for refer_id, cust_session, refer_id_persistent. However it does pick up cust_id, private_browsing_mode, tcc, and member_session… Any ideas why this might be?
Just wanted to update this post in case someone else came across this. Issue is that .NET complies with the RFC specification for cookie tags, but not all sites do. So, ultimately, the issue is not Microsoft, or .NET for the matter. (Although, IE, manages the cookies fine so it would be better to rewrite their .NET cookie parsing methods using the same parsing methods) The issue is the sites that do not follow RFC specifications.
Nonetheless, an issue I’ve often encountered is that sites will use commas in the expiration dates in their cookies. .NET interprets these as separators between different cookie fields and strips the ending and everything there after off of the cookie.
RFC spec: “Cookie:, followed by a comma-separated list of one or more cookies.” An easy solution to this problem would be for the web server to enclose values with commas in quotation marks, per the RFC document. However, there is no RFC police, so we can only hope that people follow the rules.