Python Web Scraping Basics 01


Press Fn + F12 to open Chrome DevTools, then try to request the university's Academic Affairs Office (教务处) home page:

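import urllib.request

# Request the page; give up if the server does not respond within 5 seconds.
response = urllib.request.urlopen("http://jw.scut.edu.cn/zhinan/cms/index.do", timeout=5)
# The page declares charset UTF-8, so decode the raw bytes accordingly.
print(response.read().decode("utf-8"))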
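
The three-line script stops with a traceback if the server is unreachable or the request times out, and it hard-codes UTF-8. As a rough sketch only, the same request could be wrapped in a small helper. The function name `fetch_html` and the `fallback_encoding` parameter are made up for this illustration and are not part of `urllib`:

```python
import urllib.request


def fetch_html(url, timeout=5, fallback_encoding="utf-8"):
    """Return the decoded body of `url`, or None if the request fails.

    Hypothetical helper for illustration; not part of urllib.
    """
    try:
        # The context manager closes the connection once the body is read.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            raw = response.read()
            # Prefer the charset declared in the Content-Type header, if any.
            encoding = response.headers.get_content_charset() or fallback_encoding
    except OSError as exc:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        print(f"request failed: {exc}")
        return None
    # errors="replace" keeps stray bytes from raising UnicodeDecodeError.
    return raw.decode(encoding, errors="replace")
```

Calling `fetch_html("http://jw.scut.edu.cn/zhinan/cms/index.do")` should then return the same HTML text as the script above, or `None` when the site cannot be reached.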

Running the three-line script prints:

The full text of the response:

<!DOCTYPE html>
<html>
<head>
        <meta charset="utf-8" />
        <meta http-equiv="X-UA-Compatible" content="IE=11">
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
        <meta http-equiv="Expires" content="0">
        <meta http-equiv="Pragma" content="no-cache">
        <meta http-equiv="Cache-control" content="no-cache">
        <meta http-equiv="Cache" content="no-cache">
        <title>华南理工大学教务处</title>
        <link rel="shortcut icon" href="/zhinan/../favicon.ico"/>
        <link rel="stylesheet" href="/zhinan/v2Static/css/base.css" />
        <link rel="stylesheet" href="/zhinan/v2Static/css/my.css" />
        <script type="text/javascript" src="/zhinan/v2Static/js/base/polyfill.min.js"></script>
        <!-- 若引入头尾要用el表达式,则去除引入script的include.js -->
        <script type="text/javascript" src="/zhinan/v2Static/js/base/include.js"></script>
        <script type="text/javascript" src="/zhinan/v2Static/js/base/enmus.js"></script>
        <script type="text/javascript" src="/zhinan/v2Static/js/base/common.js"></script>
</head>
<body>




<!-- header -->





<div id="header">
        <div class="header-top">
                <div class="fix-width">
                        <a href="mailto:j2jw@scut.edu.cn" class="black333 no_unl" style="float: right;">处长信箱</a><span>|</span>
                        <a href="https://www.scut.edu.cn/new/" target="_blank">华南理工大学主页</a><span>|</span>   
                        <a href="http://110.65.10.251:8089/zhinan/cms/admin/adminLogin.do" target="_blank">管理登录</a><span>|</span>
                        <a href="http://110.65.10.251:8083/jiaowuchu/cms/index.do" target="_blank">旧主页</a><span>|</span>



                </div>
        </div>
        <div class="header-middle">
                <div class="fix-width clearfix">
                        <img class="fl" src="/zhinan/v2Static/img/header/logo_left.png" width="467" onclick="toIndex();"/>
                        <div class="fr pr">
                                <img class="block" src="/zhinan/v2Static/img/header/logo_right.jpg" width="600" />  
                                <div class="search-btn">
                                        <form class="pa header-form" method="post" action="/zhinan/cms/category/v2/keyword.do">
                                                <input type="text" id="keyword" name="keyword" value="" style="border: none;"/>
                                                <img class="search-icon" onclick="searchClick()" src="/zhinan/v2Static/img/header/search_icon.png" width="18" />
                                                <input type="submit" style="display: none;" id="submitBtn"/>        
                                        </form>
                                </div>
                        </div>
                </div>
        </div>
        <div class="header-bottom">
                <div class="fix-width">
                        <div class="fr">
                                <div class="btn-link">
                                        <a class="link">学生服务</a>
                                        <ol class="btns" id="studentServeList">


                                                                <li class="btn"><a href="javascript:void(0);" onclick="serve('学业指导','http://jw.scut.edu.cn/zhinan/cms/category/index.do?id=ff8080816a9bdbc4016a9ff8e40b0000')">学业指导</a></li>



                                                                <li class="btn"><a href="javascript:void(0);" onclick="serve('学生手册','http://jw.scut.edu.cn/zhinan/cms/category/index.do?id=ff808081696f96b101696fd755450000')">学生手册</a></li>



                                                                <li class="btn"><a href="javascript:void(0);" onclick="serve('办事指南','http://jw.scut.edu.cn/zhinan/cms/category/index.do?id=ff8080816a53cdc0016a53ddc40f0000')">办事指南</a></li>



                                                                <li class="btn"><a href="javascript:void(0);" onclick="serve('常用表格','http://jw.scut.edu.cn/zhinan/cms/category/index.do?id=ff8080816a53cdc0016a53eee1bd0009')">常用表格</a></li>


                                        </ol>
                                </div>
                                <div class="btn-link">
                                        <a class="link">教师服务</a>
                                        <ol class="btns" id="teacherServeList">




                                                                <li class="btn"><a href="javascript:void(0);" onclick="serve('教师发展','http://110.65.10.226/index.do')">教师发展</a></li>



                                                                <li class="btn"><a href="javascript:void(0);" onclick="serve('办事指南','http://jw.scut.edu.cn/zhinan/cms/category/index.do?id=ff808081658ac16a016903d9a9bd09e6')">办事指南</a></li>



                                                                <li class="btn"><a href="javascript:void(0);" onclick="serve('常用表格','http://jw.scut.edu.cn/zhinan/cms/category/index.do?id=ff8080816a53cdc0016a540059cb000c')">常用表格</a></li>


                                        </ol>
                                </div>
                        </div>
                        <div class="hidden" id="menus">
                                <!-- 所有的都跳转到同一个second.html下,通过js交互处理页面的差异性 -->
                                <a href="javascript:void(0)" onclick="document.getElementById('navigation').scrollIntoView();" style="padding: 0 12px;background: #0065d3;font-weight:bold;font-size: 18px;">网站快速导航</a>
                                <a href="/zhinan/cms/index.do">首页</a>






                                                                        <a href="/zhinan/cms/category/index.do?id=ff8080816ff6cb4501703354e7f9002d" style="color: yellow;font-weight: bold;">线上教学资源</a>











                                                                        <a href="/zhinan/cms/category/index.do?id=ff808081696b990c01696c331a050002" >探究式学习</a>











                                                                        <a href="/zhinan/cms/category/index.do?id=ff808081696b990c01696c4c030f0005" >一流课程</a>











                                                                        <a href="/zhinan/cms/category/index.do?id=ff8080816d70d28b016d7aee19230007" >一流专业</a>











                                                                        <a href="/zhinan/cms/category/index.do?id=ff808081696b990c01696c4cdafd0006" >拔尖人才培养</a>











                                                                        <a href="/zhinan/cms/category/index.do?id=ff808081696b990c01696c4df4ac0007" >特色教改</a>







                        </div>
                </div>
        </div>
</div>

<script type="text/javascript">
    //回车
    document.onkeydown = function(e){
        var ev = document.all ? window.event : e;
        if(ev.keyCode==13) {
            var keyword = encodeURI($("#keyword").val());
            $("#keyword").val(keyword);
            $("#submitBtn").click();
        }
    };

    //点击搜索
    function searchClick() {
        var keyword = encodeURI($("#keyword").val());
        $("#keyword").val(keyword);
        $("#submitBtn").click();
    }

    //跳转到首页
    function toIndex() {
        location.href = basePath + "/cms/index.do";
    }

    //服务跳转
    function serve(name,link) {
        /*show2Loading("正在跳转到"+name+"......"); //渲染区域开启loading
        //延迟打开新的窗口
        setTimeout(function(){
            completeLoading();           //渲染区域关闭loading
            var newWindow=window.open();                 //打开新的窗口页面
            newWindow.location=link;//修改新打开的窗口页面的地址,这样可以防止新窗口打开被浏览器拦截
        }, 1000);*/

        var newWindow=window.open();             //打开新的窗口页面
        newWindow.location=link;//修改新打开的窗口页面的地址,这样可以防止新窗口打开被浏览器拦截
    }
</script>


<!-- wrapper -->
<div id="wrapper">
        <!-- carousel -->
        <div id="carousel"></div>

        <!-- quick -->
        <div class="quick">
                <div class="fix-width pr">
                        <div class="date">
                                <div class="date-top">


                                                <p><span class="f55">9</span>教学周</p>
                                                <p class="mt5">2021年10月31日&nbsp;&nbsp;星期日</p>


                                </div>
                                <div class="date-bottom">
                                        <p class="data-bottom">2021-2022学年</p>
                                        <p class="data-bottom">第1学期</p>

                                        <a class="date-btn" href="/zhinan/cms/category/index.do?id=f25701314e913361014e91456f810011">查看校历、作息安排</a>
                                </div>
                        </div>
                        <div class="entry">
                                <div class="entry-link">
                                        <img class="block" src="/zhinan/v2Static/img/index/system.png" /><img class="none" src="/zhinan/v2Static/img/index/system_light.png" />
                                        <div class="block">教务系统</div>
                                        <div class="none">
                                                <a href="http://xsjw2018.jw.scut.edu.cn" target="_blank">学生</a><span class="mlr10">|</span>
                                                <a href="http://jw2018.jw.scut.edu.cn" target="_blank">教师</a>     
                                        </div>
                                </div>
                                <div class="entry-link">
                                        <img class="block" src="/zhinan/v2Static/img/index/online.png" /><img class="none" src="/zhinan/v2Static/img/index/online_light.png" onclick="online()"/>
                                        <a href="http://eonline.jw.scut.edu.cn/meol/index.do" target="_blank">教学在线</a>
                                </div>
                                <div class="entry-link">
                                        <img class="block" src="/zhinan/v2Static/img/index/print.png" /><img class="none" src="/zhinan/v2Static/img/index/print_light.png" onclick="print()"/>
                                        <a href="http://www.scut.edu.cn/studentbooking/" target="_blank">出国成绩打印</a>
                                </div>
                                <div class="entry-link">
                                        <img class="block" src="/zhinan/v2Static/img/index/project.png" /><img class="none" src="/zhinan/v2Static/img/index/project_light.png" onclick="project()"/>
                                        <a href="http://110.65.10.252/cxxl/Index.aspx" target="_blank">学生项目管理</a>
                                </div>
                                <div class="entry-link">
                                        <img class="block" src="/zhinan/v2Static/img/index/report.png" /><img class="none" src="/zhinan/v2Static/img/index/report_light.png" onclick="report()"/>
                                        <a href="http://vpcs.cqvip.com/organ/lib/scut/" target="_blank">实践教学平台</a>
                                </div>
                                <