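Today, September 10, is Teachers' Day. On my way to work this morning I saw the roadside flower shops doing brisk business, so flowers are clearly the first choice for many students. Others feel that everyone giving flowers is a bit cliché, but they don't know what else to give a teacher. Having seen how much work goes into preparing lessons, I decided to use a Python crawler to help teachers find courseware online. I found this site: www.leleketang.com. Its courseware is very well made and it has videos too, but it requires an account, and applying for one is a hassle, so many teachers don't have one. That is where my crawling skills come in handy, and it should make for a rather special gift.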
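When I accessed the target site I found that it requires a logged-in account. Fetching a small amount of data works fine, but anything slightly larger gets blocked; the site is quite aggressive about banning IPs, so a proxy is a must. Here is the sample code I used: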
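using System;
using System.IO;
using System.Net;
using System.Text;

// Target page to fetch (WebRequest.Create needs an absolute URL, scheme included)
string targetUrl = "http://www.leleketang.com";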
// Proxy server (product site: www.16yun.cn)
string proxyHost = "http://t.16yun.cn";
string proxyPort = "31111";
// Proxy credentials
string proxyUser = "username";
string proxyPass = "password";
// Configure the proxy server
WebProxy proxy = new WebProxy(string.Format("{0}:{1}", proxyHost, proxyPort), true);
ServicePointManager.Expect100Continue = false;
var request = WebRequest.Create(targetUrl) as HttpWebRequest;
request.AllowAutoRedirect = true;
request.KeepAlive = true;
request.Method = "GET";
request.Proxy = proxy;
//request.Proxy.Credentials = CredentialCache.DefaultCredentials;
request.Proxy.Credentials = new System.Net.NetworkCredential(proxyUser, proxyPass);
// Set the Proxy-Tunnel header
// Random ran = new Random();
// int tunnel = ran.Next(1, 10000);
// request.Headers.Add("Proxy-Tunnel", tunnel.ToString());
//request.Timeout = 20000;
//request.ServicePoint.ConnectionLimit = 512;
//request.UserAgent = "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.82 Safari/537.36";
//request.Headers.Add("Cache-Control", "max-age=0");
//request.Headers.Add("DNT", "1");
//String encoded = System.Convert.ToBase64String(System.Text.Encoding.GetEncoding("ISO-8859-1").GetBytes(proxyUser + ":" + proxyPass));
//request.Headers.Add("Proxy-Authorization", "Basic " + encoded);
// Send the request and read the response body as UTF-8 text
using (var response = request.GetResponse() as HttpWebResponse)
using (var sr = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
{
    string htmlStr = sr.ReadToEnd();
}
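Since I mentioned a Python crawler, here is a rough Python sketch of the same idea, using only the standard library's urllib.request. It is an illustration and not a drop-in replacement for the C# above: the helper names (build_proxy_url, build_opener, fetch) are made up for this example, the host, port and credentials are the same placeholders as in the C# code, and the http:// scheme on the target URL is an assumption.

```python
import urllib.request
from urllib.parse import quote

# Placeholder values copied from the C# sample; replace them with real ones.
TARGET_URL = "http://www.leleketang.com"  # assumption: plain-HTTP scheme
PROXY_HOST = "t.16yun.cn"
PROXY_PORT = 31111
PROXY_USER = "username"
PROXY_PASS = "password"


def build_proxy_url(host, port, user, password):
    """Return an http:// proxy URL with percent-encoded credentials."""
    return "http://{}:{}@{}:{}".format(
        quote(user, safe=""), quote(password, safe=""), host, port
    )


def build_opener(proxy_url):
    """Create an opener that routes HTTP and HTTPS traffic through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(url, opener, timeout=20):
    """Fetch url through the opener and return the body decoded as UTF-8."""
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Example usage (performs a real network request, so it is left commented out):
# opener = build_opener(
#     build_proxy_url(PROXY_HOST, PROXY_PORT, PROXY_USER, PROXY_PASS))
# html = fetch(TARGET_URL, opener)
```

Percent-encoding the credentials keeps characters such as "@" or ":" in a password from breaking the proxy URL; urllib decodes them again when it builds the proxy authorization header.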
Well, that's it for this new way of giving a gift. If it sounds useful to you, go ahead and give it a try.