Suppose you need to programmatically analyze some web pages that are protected by a login procedure, and you have a valid login for the site. A simple solution is to issue a POST request to the login page with the correct credentials, then keep using the same cookie container for subsequent downloads, but in some situations this is not enough. Suppose the site uses some strange login procedure that uses redirects.
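For reference, the simple approach could look roughly like the sketch below. It is only an illustration: the form field names ("username", "password") and the idea that the site accepts a plain form POST are assumptions, and the SimpleLogin class is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class SimpleLogin
{
    // Posts the credentials to the login page and returns the cookie container
    // that received whatever cookies the server set in its response.
    // The field names "username" and "password" are assumptions.
    public static CookieContainer Login(string loginUrl, string userName, string password)
    {
        var cookies = new CookieContainer();
        var request = (HttpWebRequest)WebRequest.Create(loginUrl);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;

        byte[] body = Encoding.UTF8.GetBytes(
            "username=" + Uri.EscapeDataString(userName) +
            "&password=" + Uri.EscapeDataString(password));
        request.ContentLength = body.Length;
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }

        // Reading the response is enough to have the cookies stored in the container.
        using (var response = (HttpWebResponse)request.GetResponse())
        {
        }
        return cookies;
    }

    // Downloads a protected page, reusing the cookies obtained at login.
    public static string Download(string url, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookies;
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

When the login page depends on scripts or a chain of redirects, this kind of code tends to fall short, which is the case the rest of the post deals with.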
A possible solution is to use the WebBrowser control to navigate to the login page, locate the text boxes for user name and password, locate the "submit" button, wait for all redirects, and finally grab the cookies from the WebBrowser control. This solution is simple because the login procedure is executed inside a real browser, and we only need to grab the cookies when the whole procedure ends.
This is a sample of possible code.
Step 1: Create a WebBrowser control, handle the DocumentCompleted event, and then navigate to the login page.
DocumentCompleted is raised when the page is fully loaded, and it is where we need to start the login procedure.
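A minimal sketch of this step, assuming a Windows Forms application, might look like the following. The LoginForm class and the login URL are hypothetical placeholders, and the handler body only marks the place where the login logic would go.

```csharp
using System;
using System.Windows.Forms;

public class LoginForm : Form
{
    // Hypothetical login page; replace with the real address.
    private const string LoginUrl = "https://example.com/login";

    private readonly WebBrowser browser = new WebBrowser();

    public LoginForm()
    {
        browser.Dock = DockStyle.Fill;
        browser.ScriptErrorsSuppressed = true;
        Controls.Add(browser);

        // Subscribe before navigating, so the first page load is not missed.
        browser.DocumentCompleted += OnDocumentCompleted;
        browser.Navigate(LoginUrl);
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // The event can fire more than once for a page with frames;
        // wait until the whole document reports that it is complete.
        if (browser.ReadyState != WebBrowserReadyState.Complete)
        {
            return;
        }

        // The login logic (fill the form, click submit, grab cookies) starts here.
        Text = "Loaded: " + e.Url;
    }

    [STAThread]
    public static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new LoginForm());
    }
}
```

The WebBrowser control needs an STA thread and a message loop, which is why the sketch hosts it in a form instead of a console program.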
Step 2: Locate the two input controls for username and password, fill them with the right values, then locate the submit button and finally invoke its "click" method.
As you can see, the code is really simple. Input controls can be located by name, by id, or by class; for this simple example I locate them by name with the following simple function.
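The fill-and-click part could be sketched as below. LoginSubmitter is a hypothetical helper, and the three elements are assumed to have been located already (by name, id, or class, whichever suits the page).

```csharp
using System.Windows.Forms;

public static class LoginSubmitter
{
    // Fills the two text boxes and clicks the submit button.
    // Returns false when any of the elements was not found on the page.
    public static bool FillAndSubmit(
        HtmlElement userBox,
        HtmlElement passwordBox,
        HtmlElement submitButton,
        string userName,
        string password)
    {
        if (userBox == null || passwordBox == null || submitButton == null)
        {
            return false;
        }

        // Setting the "value" attribute is equivalent to typing into the input.
        userBox.SetAttribute("value", userName);
        passwordBox.SetAttribute("value", password);

        // Invokes the DOM click() method, as if the user had pressed the button.
        submitButton.InvokeMember("click");
        return true;
    }
}
```

Because the click triggers a new navigation, DocumentCompleted will fire again for the pages that follow, so the handler needs some way to tell the login page from the pages reached after the redirects.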
Step 3: Function to locate an input control by name.
This function is really simple: it iterates over all HtmlElement objects of type "input" present in the page, checks whether each one's name is equal to the desired one, and returns the matching element.
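A possible shape for such a function is sketched here. The class and method names are hypothetical, and returning null when nothing matches is an assumption about how a missing element should be reported.

```csharp
using System;
using System.Windows.Forms;

public static class HtmlHelpers
{
    // Returns the first <input> element whose name attribute equals the
    // requested name, or null when the page contains no such element.
    public static HtmlElement GetInputByName(HtmlDocument document, string name)
    {
        foreach (HtmlElement element in document.GetElementsByTagName("input"))
        {
            if (string.Equals(element.Name, name, StringComparison.Ordinal))
            {
                return element;
            }
        }
        return null;
    }
}
```

Inside the DocumentCompleted handler it would be called with the browser's Document property, for example HtmlHelpers.GetInputByName(browser.Document, "username"), where "username" stands for whatever name the real page uses.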
Step 4: Determine the base URI of the site and grab all cookies with the CookieHelpers.GetUriCookieContainer function.
All the work is done inside the GetUriCookieContainer method, which uses the Windows API to retrieve the cookies. Once the CookieContainer used by the WebBrowser is grabbed, you can simply get its CookieCollection and add it to another CookieContainer that will be used by subsequent WebRequest objects.
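The two pure .NET parts of this step, computing the base URI and moving the cookies into a fresh container, could be sketched like this. CookieTransfer is a hypothetical name, and the source container is assumed to be the one returned by GetUriCookieContainer.

```csharp
using System;
using System.Net;

public static class CookieTransfer
{
    // Keeps only scheme, host and port of the page address,
    // e.g. https://example.com/account/login becomes https://example.com/
    public static Uri GetBaseUri(Uri pageUri)
    {
        return new Uri(pageUri.GetLeftPart(UriPartial.Authority));
    }

    // Copies the cookies that apply to baseUri into a new container that
    // can be assigned to HttpWebRequest.CookieContainer.
    public static CookieContainer CopyCookies(CookieContainer browserCookies, Uri baseUri)
    {
        var target = new CookieContainer();
        if (browserCookies != null)
        {
            CookieCollection collection = browserCookies.GetCookies(baseUri);
            target.Add(collection);
        }
        return target;
    }
}
```

In the handler, the page address would come from the WebBrowser's Url property once the last redirect has completed.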
Step 5: Declare the import needed to use the Windows API.
Now we can use InternetGetCookieEx to grab all the cookies.
Step 6: Grab all cookies with the InternetGetCookieEx API; this is needed to retrieve HttpOnly cookies.
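Steps 5 and 6 together might look like the sketch below, which follows a widely circulated pattern and is not necessarily the exact original code. The initial buffer size and the choice of CharSet.Unicode are assumptions; INTERNET_COOKIE_HTTPONLY (0x2000) is the flag that asks WinINet to include HttpOnly cookies.

```csharp
using System;
using System.Net;
using System.Runtime.InteropServices;
using System.Text;

public static class CookieHelpers
{
    // Flag that makes InternetGetCookieEx return HttpOnly cookies as well.
    private const int InternetCookieHttpOnly = 0x2000;

    [DllImport("wininet.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern bool InternetGetCookieEx(
        string url,
        string cookieName,
        StringBuilder cookieData,
        ref int size,
        int flags,
        IntPtr reserved);

    // Returns a container with the cookies WinINet holds for the given URI,
    // or null when no cookie could be read.
    public static CookieContainer GetUriCookieContainer(Uri uri)
    {
        int dataSize = 8192; // initial guess, an assumption
        var cookieData = new StringBuilder(dataSize);

        if (!InternetGetCookieEx(uri.ToString(), null, cookieData,
                ref dataSize, InternetCookieHttpOnly, IntPtr.Zero))
        {
            if (dataSize <= 0)
            {
                return null;
            }

            // The first buffer was too small: retry with the size reported back.
            cookieData = new StringBuilder(dataSize);
            if (!InternetGetCookieEx(uri.ToString(), null, cookieData,
                    ref dataSize, InternetCookieHttpOnly, IntPtr.Zero))
            {
                return null;
            }
        }

        if (cookieData.Length == 0)
        {
            return null;
        }

        // WinINet separates cookies with ';' while SetCookies expects ','.
        // Note: a cookie value that itself contains a comma would be split wrongly.
        var cookies = new CookieContainer();
        cookies.SetCookies(uri, cookieData.ToString().Replace(';', ','));
        return cookies;
    }
}
```

Reading document.cookie from the WebBrowser would skip HttpOnly cookies, which is the reason for going through the Windows API here.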
Now the game is done. As a last warning, I suggest you clear all WebBrowser cookies before starting the login procedure, because leftover cookies could lead to problems. I found this solution on StackOverflow (I do not remember the link, sorry).
Snippet 1: Method to clear all the cookies; this is needed to be sure that the WebBrowser control has no cookies when the login procedure begins.
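Since the original link is lost, the following is only one commonly cited way to get a similar effect, and it may differ from the snippet the post refers to. It asks WinINet, through InternetSetOption, to suppress cookie persistence for the current process; the CookieCleaner name is hypothetical, and the option requires Internet Explorer 8 or later.

```csharp
using System;
using System.Runtime.InteropServices;

public static class CookieCleaner
{
    // WinINet option values from wininet.h.
    private const int InternetOptionSuppressBehavior = 81;
    private const int InternetSuppressCookiePersist = 3;

    [DllImport("wininet.dll", SetLastError = true)]
    private static extern bool InternetSetOption(
        IntPtr hInternet,
        int option,
        IntPtr buffer,
        int bufferLength);

    // Call this before the WebBrowser control navigates anywhere.
    // Returns false when WinINet rejects the option.
    public static bool SuppressCookiePersistence()
    {
        // The option expects a pointer to a DWORD holding the behavior to suppress.
        IntPtr buffer = Marshal.AllocHGlobal(sizeof(int));
        try
        {
            Marshal.WriteInt32(buffer, InternetSuppressCookiePersist);
            return InternetSetOption(
                IntPtr.Zero,
                InternetOptionSuppressBehavior,
                buffer,
                sizeof(int));
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

The setting applies to the whole process, so it also affects any other WebBrowser control hosted in the same application.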