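A month has passed since my previous article, "How an Interviewer Drove a Candidate Crazy in One Hour: Let's Talk About the Python Import System" (《面试官一个小时逼疯面试者》之聊聊Python Import System?). In the meantime I asked many experienced engineers what they thought of it. All told I collected feedback from nearly 10 senior engineers and close to a hundred readers, and their verdict was unanimous: "The article is too dry and is not suitable for interviews. You are bringing Java-style rote interview trivia to Python. Seriously, if interviews now have to dig into the CPython source code, who could survive that?"
Fine, I admit the title was somewhat clickbait. I also have little experience as an interviewer and was not clear on how to assess a candidate's ability in an interview, so the article drifted off topic. So I am back for another try: drawing on all that feedback, I will share interview questions that can be developed from the Import System. I have split this into two parts. The first covers the principles behind techniques we use with the Import System every day, and the second covers solutions to real-world scenarios involving the Import System.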
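First up is the principles part.
I. Principles
Unlike the previous article, this part does not dig into the CPython source code to trace the underlying logic, so you can relax. "Principles" here is relative to the problems we meet in real projects. For example, if we were to write a web framework from scratch, the principles would be what we need to understand about the network protocol layers, application servers, web servers and so on. Some points were already covered in the previous article and are simply referenced here. If anything is unfamiliar, you can revisit "How an Interviewer Drove a Candidate Crazy in One Hour: Let's Talk About the Python Import System".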
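1 Python package search path
For Python developers, the first headache in the Import System is usually the package search path, and it is also a frequent interview question. So the first question to settle is: in what order does Python search for packages, and how should we make sense of that order? The answer is in the flow chart of the import keyword from the previous article, so let's revisit it.
The first thing to note is that three list structures in that flow chart cannot be ignored: sys.path, sys.meta_path and sys.path_hooks.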
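Understanding the package search path ultimately comes down to understanding the order of the Importers in sys.meta_path. The source code loops over the Importers in sys.meta_path and passes each one a path to search. That path comes from parent_module.__path__. For a top-level import there is no parent module, so path is empty and the search falls back to sys.path. Within sys.meta_path, the Importer that mainly uses sys.path is PathFinder. Our first conclusion is therefore that the classes consulted first are the two Importers ahead of PathFinder, namely BuiltinImporter and FrozenImporter, which cover built-in modules and frozen modules. The order so far is
built-in modules -> frozen modules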
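The second stage is handled by the third Importer, PathFinder, which loops over the sys.path list. Since it loops over sys.path, the order of sys.path is one segment of the whole search chain. Let's look at what sys.path contains: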
(base) [root@VM-0-8-centos ~]# python -m site
sys.path = [
    '/root',  # project root (the current working directory)
    '/root/miniconda3/lib/python38.zip',  # standard library of the current environment
    '/root/miniconda3/lib/python3.8',
    '/root/miniconda3/lib/python3.8/lib-dynload',
    '/root/miniconda3/lib/python3.8/site-packages',  # third-party packages of the current environment
]
USER_BASE: '/root/.local' (doesn't exist)
USER_SITE: '/root/.local/lib/python3.8/site-packages' (doesn't exist)
ENABLE_USER_SITE: True
This is what a standard sys.path list looks like, and it shows that the search order is
root directory -> standard library -> third-party packages
Two things need attention here, though. The first is PYTHONPATH: as shown above, we did not set it explicitly. Let's first see what PYTHONPATH is (quoted from the PYTHONPATH entry in the Python docs):
Augment the default search path for module files. The format is the same as the shell's PATH: one or more directory pathnames separated by os.pathsep (e.g. colons on Unix or semicolons on Windows). Non-existent directories are silently ignored. (Author's note: it extends the package search path, uses the same format as the shell's PATH, and can hold several paths.)
In addition to normal directories, individual PYTHONPATH entries may refer to zipfiles containing pure Python modules (in either source or compiled form). Extension modules cannot be imported from zipfiles.
The default search path is installation dependent, but generally begins with prefix/lib/pythonversion (see PYTHONHOME above). It is always appended to PYTHONPATH.
An additional directory will be inserted in the search path in front of PYTHONPATH as described above under Interface options. The search path can be manipulated from within a Python program as the variable sys.path. (Author's note: the PYTHONPATH entries end up in sys.path.)
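From this explanation, several paths can be supplied, separated by os.pathsep (a colon on Unix, a semicolon on Windows), and they are inserted into sys.path. Let's set PYTHONPATH and check the result: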
(base) [root@VM-0-8-centos ~]# export PYTHONPATH=/root/test  # set a specific path (the path does not have to exist)
(base) [root@VM-0-8-centos ~]# python -m site
sys.path = [
    '/root',
    '/root/test',
    '/root/miniconda3/lib/python38.zip',
    '/root/miniconda3/lib/python3.8',
    '/root/miniconda3/lib/python3.8/lib-dynload',
    '/root/miniconda3/lib/python3.8/site-packages',
]
USER_BASE: '/root/.local' (doesn't exist)
USER_SITE: '/root/.local/lib/python3.8/site-packages' (doesn't exist)
ENABLE_USER_SITE: True
As you can see, the PYTHONPATH entry has been added to sys.path, placed after the root directory and ahead of the environment's standard library.
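The same observation can be reproduced from Python itself. The sketch below starts a fresh interpreter with PYTHONPATH set and compares positions in the child's sys.path. The helper names `child_sys_path` and `position` are made up for this illustration, and the temporary directory stands in for /root/test.

```python
import json
import os
import subprocess
import sys
import sysconfig
import tempfile


def child_sys_path(extra_dir):
    """Start a fresh interpreter with PYTHONPATH=extra_dir and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def position(path_list, directory):
    """Index of directory in path_list, comparing symlink-resolved paths."""
    resolved = [os.path.realpath(p) for p in path_list]
    return resolved.index(os.path.realpath(directory))


with tempfile.TemporaryDirectory() as extra:
    paths = child_sys_path(extra)
    extra_pos = position(paths, extra)
    stdlib_pos = position(paths, sysconfig.get_path("stdlib"))

print("PYTHONPATH entry at index", extra_pos, "| standard library at index", stdlib_pos)
```

The PYTHONPATH entry should land at a smaller index than the standard library directory, mirroring the `python -m site` output above.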
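An interesting aside: setting PYTHONPATH by hand looks a bit like switching virtual environments. For each "environment" we would just point PYTHONPATH somewhere else to change which package versions get picked up, which makes it a crude substitute for environment management. Of course, using PYTHONPATH this way causes many problems, and it is not how real virtual environments work. A real virtual environment mainly changes the package paths and, on activation, the current $PATH, whereas PYTHONPATH entries are placed ahead of the virtual environment's own directories and therefore affect every virtual environment. We will cover this properly in a later article.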
With PYTHONPATH covered, the second thing to note is pth files. For an introduction see pep-0648, whose title sums up the idea:
Extensible customizations of the interpreter at startup
We can rely on pth files to customize how packages are loaded, and the paths listed in a pth file are added to sys.path:
Note that pth files were originally developed to just add additional directories to sys.path, but they may also contain lines which start with "import", which will be passed to exec(). Users have exploited this feature to allow the customizations that they needed. See setuptools [4] or betterexceptions [5] as examples.
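So how is it used in practice? We go to the site-packages directory and create a pth file there. (Note that pth files are processed by the site module at startup, and only in site directories such as site-packages. The paths they list are appended after the third-party packages, so the file is normally placed in site-packages.)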
(base) [root@VM-0-8-centos site-packages]# cd /root/miniconda3/lib/python3.8/site-packages
(base) [root@VM-0-8-centos site-packages]# echo "/root/test" > test.pth
(base) [root@VM-0-8-centos site-packages]# mkdir /root/test
mkdir: cannot create directory ‘/root/test’: File exists
(base) [root@VM-0-8-centos site-packages]# python -m site
sys.path = [
    '/root/miniconda3/lib/python3.8/site-packages',
    '/root/miniconda3/lib/python38.zip',
    '/root/miniconda3/lib/python3.8',
    '/root/miniconda3/lib/python3.8/lib-dynload',
    '/root/test',
]
USER_BASE: '/root/.local' (doesn't exist)
USER_SITE: '/root/.local/lib/python3.8/site-packages' (doesn't exist)
ENABLE_USER_SITE: True
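As shown, we created a pth file in site-packages and put a path in it. When the interpreter starts, the site module scans the site directories, and for every pth file it finds it parses the contents and adds the listed paths to sys.path as extra search paths (as the PEP notes, paths are not the only thing a pth file can contain).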
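The pth mechanism can be tried without touching the real site-packages. The sketch below uses `site.addsitedir`, which processes the pth files of a directory the same way the site module does at startup. The temporary directories and the helper `add_site_dir_with_pth` are stand-ins made up for this illustration.

```python
import os
import site
import sys
import tempfile


def add_site_dir_with_pth(site_dir, extra_dir, pth_name="demo.pth"):
    """Write a .pth file listing extra_dir into site_dir, then process site_dir."""
    with open(os.path.join(site_dir, pth_name), "w") as fh:
        fh.write(extra_dir + "\n")
    # addsitedir() appends site_dir to sys.path and then handles its .pth files.
    # Paths listed in a .pth file are only added if they exist on disk.
    site.addsitedir(site_dir)


site_dir = tempfile.mkdtemp()   # stands in for site-packages
extra_dir = tempfile.mkdtemp()  # the directory named inside the .pth file
before = list(sys.path)
add_site_dir_with_pth(site_dir, extra_dir)
added = [p for p in sys.path if p not in before]
print(added)
```

The directory named in the pth file is appended after the site directory itself, which is why '/root/test' appears at the end of the list in the output above.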
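To sum up, the overall search chain is
built-in modules -> frozen modules -> root directory -> PYTHONPATH -> standard library -> third-party packages -> paths from pth files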
You may wonder here: aren't built-in modules and the standard library the same thing in Python? Not quite. The official Python documentation says:
Python’s standard library is very extensive, offering a wide range of facilities as indicated by the long table of contents listed below. The library contains built-in modules (written in C) that provide access to system functionality such as file I/O that would otherwise be inaccessible to Python programmers, as well as modules written in Python that provide standardized solutions for many problems that occur in everyday programming. Some of these modules are explicitly designed to encourage and enhance the portability of Python programs by abstracting away platform-specifics into platform-neutral APIs.
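It says that built-in modules are written in C and provide access to system functionality. For example, you will not find sys under Python's standard library path, because it is written in C and compiled into the interpreter. The asyncio module, by contrast, can be found there because it is written in Python.
This passage shows that built-in modules are not the same thing as the rest of the standard library, even though they are classified under it. The point is that they share a category, not that they are identical in nature. Insisting on the distinction may look like nitpicking, and in most cases there really is no need to tell them apart. For understanding Python's module lookup order, however, it is a crucial difference. We have already walked through the search order and saw that the project root sits between the built-in modules and the standard library. Now imagine that we have local module files with the same names as a built-in module and a standard library module: which one gets shadowed?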
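One way to explore that question is to replay the finder loop by hand. The sketch below puts two local files, one named after the built-in `sys` and one named after the pure-Python standard library module `colorsys`, into a temporary "project root" at the front of sys.path and asks each finder in turn. `first_match` is a helper made up for this illustration. Unlike the real import statement it does not consult sys.modules first, so it only shows which finder would win.

```python
import importlib
import os
import sys
import tempfile


def first_match(name):
    """Ask each finder in sys.meta_path in order and return the first spec found."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        spec = find_spec(name, None) if find_spec else None
        if spec is not None:
            return spec
    return None


with tempfile.TemporaryDirectory() as root:
    # Local files reusing the names of a built-in and a standard library module.
    for filename in ("sys.py", "colorsys.py"):
        with open(os.path.join(root, filename), "w") as fh:
            fh.write("SHADOW = True\n")
    sys.path.insert(0, root)  # pretend root is the project directory
    importlib.invalidate_caches()
    try:
        builtin_origin = first_match("sys").origin
        stdlib_origin = first_match("colorsys").origin
    finally:
        sys.path.remove(root)
        sys.path_importer_cache.pop(root, None)

print("sys      ->", builtin_origin)
print("colorsys ->", stdlib_origin)
```

Under these assumptions the name `sys` still resolves to the built-in module because BuiltinImporter answers before PathFinder is asked, while `colorsys` resolves to the local file because the project root comes before the standard library in sys.path.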
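2 Import protocol and hook registration
In the previous article we walked through the core flow of the Import System and learned that we can customize how modules are imported by adding import hooks at various stages of the import flow. Next, let's see how to use the Import System's Importer Protocol to develop a custom importer and register it as an import hook.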
First we need to understand what the Importer Protocol is; see pep-0302-specification-part-1-the-importer-protocol for the explanation. The Importer Protocol involves two kinds of objects, the finder and the loader, and the methods that do the work are finder.find_spec and loader.exec_module (from Python 3.4 onward; before that they were find_module and load_module).
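From the above, implementing a custom importer takes two pieces of work:
- Implement the Finder protocol
- Implement the Loader protocol
Before looking at how the two protocols work, note the thing that connects them: ModuleSpec (the module spec).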
2.1 ModuleSpec (module spec)
What is a module spec? Here is how the official documentation puts it:
The import machinery uses a variety of information about each module during import, especially before loading. Most of the information is common to all modules. The purpose of a module's spec is to encapsulate this import-related information on a per-module basis.
Using a spec during import allows state to be transferred between import system components, e.g. between the finder that creates the module spec and the loader that executes it. Most importantly, it allows the import machinery to perform the boilerplate operations of loading, whereas without a module spec the loader had that responsibility. (Author's note: the spec carries state between the finder and the loader.)
The module's spec is exposed as the __spec__ attribute on a module object. See ModuleSpec for details on the contents of the module spec.
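Put simply, it is a consolidation of the information about a module. ModuleSpec was introduced in Python 3.4. According to PEP 451 -- A ModuleSpec Type for the Import System, a finder now returns a ModuleSpec instead of returning a loader directly, which decouples the two and packages the module information in one place.
Let's see what __spec__ contains: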
(base) [root@VM-0-8-centos ~]# python
Python 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.__spec__
ModuleSpec(name='sys', loader=<class '_frozen_importlib.BuiltinImporter'>)
>>>
It includes the module's name and loader, along with other attributes that are not shown in the repr:
On ModuleSpec | On Modules |
---|---|
name | __name__ |
loader | __loader__ |
parent | __package__ |
origin | __file__ |
cached | __cached__ |
submodule_search_locations | __path__ |
loader_state | - |
has_location | - |
The values held in a module's ModuleSpec can also be read from the corresponding attributes of the module itself, for example:
(base) [root@VM-0-8-centos ~]# python
Python 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.__name__
'sys'
>>> sys.__spec__.name
'sys'
>>>
As you can see, compared with returning a loader directly, a ModuleSpec carries richer information and is easier to build on.
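The correspondence in the table can be checked on a file-based package. The sketch below uses the standard library package `json` purely as a convenient example and compares a few spec fields with the matching module attributes.

```python
import json

spec = json.__spec__
# Each entry pairs a ModuleSpec field with the module attribute from the table.
pairs = {
    "name": (spec.name, json.__name__),
    "parent": (spec.parent, json.__package__),
    "origin": (spec.origin, json.__file__),
    "submodule_search_locations": (spec.submodule_search_locations, json.__path__),
}
for field, (on_spec, on_module) in pairs.items():
    print(f"{field:28} matches: {on_spec == on_module}")
```

Because `json` is a package, `submodule_search_locations` is populated; for a plain module it would be None and the module would have no `__path__`.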
2.2 Finders
Core principle: find_spec returns a ModuleSpec or None
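The built-in finder classes live in sys.meta_path, and they all share one method, find_spec. It is the ModuleSpec object returned by this method that gives the loader what it needs to load the module. So to write a custom finder, the core of the job is to implement find_spec and return a ModuleSpec for the module in question.
For customizing low-level classes like these, Python provides abstract base classes that we can simply subclass. For finders, the two main ones are: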
class importlib.abc.MetaPathFinder
    def find_spec(fullname, path, target=None)
        '''
        An abstract method for finding a spec for the specified module. If this is a top-level import, path will be None. Otherwise, this is a search for a subpackage or module and path will be the value of __path__ from the parent package. If a spec cannot be found, None is returned. When passed in, target is a module object that the finder may use to make a more educated guess about what spec to return. importlib.util.spec_from_loader() may be useful for implementing concrete MetaPathFinders.
        '''
class importlib.abc.PathEntryFinder
    def find_spec(fullname, target=None)
        '''
        An abstract method for finding a spec for the specified module. The finder will search for the module only within the path entry to which it is assigned. If a spec cannot be found, None is returned. When passed in, target is a module object that the finder may use to make a more educated guess about what spec to return. importlib.util.spec_from_loader() may be useful for implementing concrete PathEntryFinders.
        '''
At this point you are probably wondering which of these two finders to subclass. The official documentation explains it: the two are indeed very similar, but their scopes differ.
path entry finder
A finder returned by a callable on sys.path_hooks which knows how to locate modules given a path entry. See importlib.abc.PathEntryFinder for the methods that path entry finders implement.
meta path finder
A finder returned by a search of sys.meta_path. Meta path finders are related to, but different from path entry finders. See importlib.abc.MetaPathFinder for the methods that meta path finders implement.
They are registered through sys.path_hooks and sys.meta_path respectively, and we can find traces of this in the source code:
# importlib/abc.py (excerpt)
def _register(abstract_cls, *classes):
    for cls in classes:
        # Register cls as a virtual subclass of the abstract class
        abstract_cls.register(cls)
        if _frozen_importlib is not None:
            try:
                frozen_cls = getattr(_frozen_importlib, cls.__name__)
            except AttributeError:
                frozen_cls = getattr(_frozen_importlib_external, cls.__name__)
            abstract_cls.register(frozen_cls)
_register(MetaPathFinder, machinery.BuiltinImporter, machinery.FrozenImporter,
          machinery.PathFinder, machinery.WindowsRegistryFinder)
_register(PathEntryFinder, machinery.FileFinder)
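The PathEntryFinder class registers FileFinder as its virtual subclass, and FileFinder is what backs one of the hooks in sys.path_hooks.
The MetaPathFinder class registers BuiltinImporter, FrozenImporter and PathFinder as its virtual subclasses, and these are exactly the entries found in sys.meta_path.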
Bonus question: with Python ABCs, what is the difference between subclassing directly and using register?
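A small experiment makes the difference visible. The class names below (`DemoFinder`, `Inherited`, `Registered`) are made up for this illustration and are not part of importlib.

```python
import abc


class DemoFinder(abc.ABC):
    @abc.abstractmethod
    def find_spec(self, fullname, path, target=None):
        """Return a spec or None."""

    def describe(self):
        return "I am a DemoFinder"


class Inherited(DemoFinder):
    # A real subclass must implement every abstract method to be instantiable,
    # and it inherits concrete methods such as describe().
    def find_spec(self, fullname, path, target=None):
        return None


class Registered:
    # An unrelated class: it defines neither find_spec nor describe.
    pass


DemoFinder.register(Registered)  # make Registered a virtual subclass

print(issubclass(Inherited, DemoFinder), issubclass(Registered, DemoFinder))  # True True
print(hasattr(Inherited(), "describe"), hasattr(Registered(), "describe"))    # True False
print(DemoFinder in Registered.__mro__)                                       # False
```

In short, register only affects isinstance and issubclass checks. The registered class inherits nothing from the ABC and is not checked for missing abstract methods.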
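Developers can pick whichever fits their needs; MetaPathFinder clearly has the broader scope.
To implement the finder protocol, we only need to create a class that subclasses MetaPathFinder, implement its find_spec method, and return a ModuleSpec for the module.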
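As a minimal sketch of that recipe, the finder below searches one extra directory for top-level modules and delegates the actual lookup to the standard PathFinder, so it returns either a ModuleSpec or None. The class name `ExtraDirFinder`, the module `demo_plugin` and the temporary directory are all made up for this illustration.

```python
import importlib
import importlib.abc
import importlib.machinery
import os
import sys
import tempfile


class ExtraDirFinder(importlib.abc.MetaPathFinder):
    """Finder that looks for top-level modules in one extra directory."""

    def __init__(self, directory):
        self.directory = directory

    def find_spec(self, fullname, path, target=None):
        if path is not None:
            # Submodule import: leave it to the finders that follow.
            return None
        # PathFinder.find_spec returns a ModuleSpec, or None when nothing matches.
        return importlib.machinery.PathFinder.find_spec(fullname, [self.directory])


plugin_dir = tempfile.mkdtemp()  # a directory that is not on sys.path
with open(os.path.join(plugin_dir, "demo_plugin.py"), "w") as fh:
    fh.write("GREETING = 'hello from the extra directory'\n")

finder = ExtraDirFinder(plugin_dir)
sys.meta_path.append(finder)  # consulted after the default finders
importlib.invalidate_caches()
try:
    import demo_plugin
    print(demo_plugin.GREETING)
    print(demo_plugin.__spec__.origin)
finally:
    sys.meta_path.remove(finder)
```

Appending the finder keeps the default behaviour intact and only adds a fallback location. Inserting it at the front would let it take precedence over the default finders instead.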
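2.3 Loaders
Core principle: exec_module is the key method. The core flow is the same for all loaders, while different kinds of Loader extend it in different ways.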
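Compared with the finder's find_spec, the loader's exec_module actually loads the module, so the mechanics are more involved. Thanks to ModuleSpec, however, many implementation steps are no longer needed. Let's use the official documentation to see what the old load_module (the loading method before Python 3.4) had to do: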
- If there is an existing module object named 'fullname' in sys.modules, the loader must use that existing module. (Otherwise, the reload() builtin will not work correctly.) If a module named 'fullname' does not exist in sys.modules, the loader must create a new module object and add it to sys.modules. Note that the module object must be in sys.modules before the loader executes the module code. This is crucial because the module code may (directly or indirectly) import itself; adding it to sys.modules beforehand prevents unbounded recursion in the worst case and multiple loading in the best. If the load fails, the loader needs to remove any module it may have inserted into sys.modules. If the module was already in sys.modules then the loader should leave it alone.
- The __file__ attribute must be set. This must be a string, but it may be a dummy value, for example "<frozen>". The privilege of not having a __file__ attribute at all is reserved for built-in modules.
- The __name__ attribute must be set. If one uses imp.new_module() then the attribute is set automatically.
- If it's a package, the __path__ variable must be set. This must be a list, but may be empty if __path__ has no further significance to the importer (more on this later).
- The __loader__ attribute must be set to the loader object. This is mostly for introspection and reloading, but can be used for importer-specific extras, for example getting data associated with an importer.
- The __package__ attribute must be set.
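The documentation shows that load_module has to set a large number of module attributes by hand before it runs the module's code. Here is the concrete example given in the official documentation: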
import imp  # legacy example: the imp module was removed in Python 3.12
import sys
# Consider using importlib.util.module_for_loader() to handle
# most of these details for you.
def load_module(self, fullname):
    # Get the code object for the module
    code = self.get_code(fullname)
    ispkg = self.is_package(fullname)
    # Get (or create) the module, then assign its attributes by hand
    mod = sys.modules.setdefault(fullname, imp.new_module(fullname))
    mod.__file__ = "<%s>" % self.__class__.__name__
    mod.__loader__ = self
    if ispkg:
        mod.__path__ = []
        mod.__package__ = fullname
    else:
        mod.__package__ = fullname.rpartition('.')[0]
    # Execute the code with the module's __dict__ as its namespace
    exec(code, mod.__dict__)
    return mod
Since ModuleSpec was introduced, these assignments are handled by functions inside importlib, so we can skip them. Today it can be done like this:
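# Build the module object directly from its ModuleSpec and import it
# importlib/_bootstrap.py (excerpt)
def _new_module(name):
    # type(sys) is the module type, so this creates a new, empty module object
    return type(sys)(name)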
def _init_module_attrs(spec, module, *, override=False):
    # The passed-in module may be not support attribute assignment,
    # in which case we simply don't set the attributes.
    # __name__
    if (override or getattr(module, '__name__', None) is None):
        try:
            module.__name__ = spec.name
        except AttributeError:
            pass
    # __loader__
    if override or getattr(module, '__loader__', None) is None:
        loader = spec.loader
        if loader is None:
            # A backward compatibility hack.
            if spec.submodule_search_locations is not None:
                if _bootstrap_external is None:
                    raise NotImplementedError
                _NamespaceLoader = _bootstrap_external._NamespaceLoader
                loader = _NamespaceLoader.__new__(_NamespaceLoader)
                loader._path = spec.submodule_search_locations
                spec.loader = loader
                # While the docs say that module.__file__ is not set for
                # built-in modules, and the code below will avoid setting it if
                # spec.has_location is false, this is incorrect for namespace
                # packages. Namespace packages have no location, but their
                # __spec__.origin is None, and thus their module.__file__
                # should also be None for consistency. While a bit of a hack,
                # this is the best place to ensure this consistency.
                #
                # See # https://docs.python.org/3/library/importlib.html#importlib.abc.Loader.load_module
                # and bpo-32305
                module.__file__ = None
        try:
            module.__loader__ = loader
        except AttributeError:
            pass
    # __package__
    if override or getattr(module, '__package__', None) is None:
        try:
            module.__package__ = spec.parent
        except AttributeError:
            pass
    # __spec__
    try:
        module.__spec__ = spec
    except AttributeError:
        pass
    # __path__
    if override or getattr(module, '__path__', None) is None:
        if spec.submodule_search_locations is not None:
            try:
                module.__path__ = spec.submodule_search_locations
            except AttributeError:
                pass
    # __file__/__cached__
    if spec.has_location:
        if override or getattr(module, '__file__', None) is None:
            try:
                module.__file__ = spec.origin
            except AttributeError:
                pass
        if override or getattr(module, '__cached__', None) is None:
            if spec.cached is not None:
                try:
                    module.__cached__ = spec.cached
                except AttributeError:
                    pass
    return module
def module_from_spec(spec):
    """Create a module based on the provided spec."""
    # Typically loaders will not implement create_module().
    module = None
    if hasattr(spec.loader, 'create_module'):
        # If create_module() returns `None` then it means default
        # module creation should be used.
        module = spec.loader.create_module(spec)
    elif hasattr(spec.loader, 'exec_module'):
        raise ImportError('loaders that define exec_module() '
                          'must also define create_module()')
    if module is None:
        module = _new_module(spec.name)
    _init_module_attrs(spec, module)
    return module
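# Usage: create the module from its spec, then let the loader execute it
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
# A loader's exec_module() only has to run the code in the module's namespace
def exec_module(self, module):
    filename = self.get_filename(self.fullname)
    poc_code = self.get_data(filename)
    obj = compile(poc_code, filename, 'exec', dont_inherit=True, optimize=-1)
    exec(obj, module.__dict__)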
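Put together, the spec-based flow fits in a few lines. The sketch below loads a module from an arbitrary file path using only public importlib.util functions. The helper `load_from_path` and the file `settings_demo.py` are made up for this illustration.

```python
import importlib.util
import os
import tempfile


def load_from_path(name, file_path):
    """Load file_path as a module called name without touching sys.path."""
    spec = importlib.util.spec_from_file_location(name, file_path)
    module = importlib.util.module_from_spec(spec)  # create module, set attributes
    spec.loader.exec_module(module)                 # run the module's code
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "settings_demo.py")
    with open(path, "w") as fh:
        fh.write("DEBUG = True\n")
    settings = load_from_path("settings_demo", path)

print(settings.DEBUG, settings.__name__, settings.__spec__.name)
```

Note that this helper does not add the module to sys.modules. A caller that wants later `import settings_demo` statements to reuse it would have to register it there.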
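Having covered the basics of the exec_module protocol, let's follow the same approach as for Finder and see which official abstract classes we can build on. The class hierarchy from the official importlib.abc documentation (docs.python.org/3/library/i…) is: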
object
 +-- Finder (deprecated)
 |    +-- MetaPathFinder
 |    +-- PathEntryFinder
 +-- Loader
      +-- ResourceLoader --------+
      +-- InspectLoader          |
           +-- ExecutionLoader --+
                                 +-- FileLoader
                                 +-- SourceLoader
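The Loader abstract classes fall into three groups. One is ResourceLoader, which has been deprecated since Python 3.7 in favor of ResourceReader. The official recommendation for this class is to implement it for specific resources, that is, to load data only for a particular package.
The other two Loaders both derive from InspectLoader, so their underlying logic is the same: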
# importlib.abc
class Loader(metaclass=abc.ABCMeta):
    """Abstract base class for import loaders."""
    def create_module(self, spec):
        """Return a module to initialize and into which to load.
        This method should raise ImportError if anything prevents it
        from creating a new module. It may return None to indicate
        that the spec should create the new module.
        """
        # By default, defer to default semantics for the new module.
        return None
    # We don't define exec_module() here since that would break
    # hasattr checks we do to support backward compatibility.
    def load_module(self, fullname):
        """Return the loaded module.
        The module must be added to sys.modules and have import-related
        attributes set properly. The fullname is a str.
        ImportError is raised on failure.
        This method is deprecated in favor of loader.exec_module(). If
        exec_module() exists then it is used to provide a backwards-compatible
        functionality for this method.
        """
        if not hasattr(self, 'exec_module'):
            raise ImportError
        return _bootstrap._load_module_shim(self, fullname)
    def module_repr(self, module):
        """Return a module's repr.
        Used by the module type when the method does not raise
        NotImplementedError.
        This method is deprecated.
        """
        # The exception will cause ModuleType.__repr__ to ignore this method.
        raise NotImplementedError
class InspectLoader(Loader):
    """Abstract base class for loaders which support inspection about the
    modules they can load.
    This ABC represents one of the optional protocols specified by PEP 302.
    """
    def is_package(self, fullname):
        """Optional method which when implemented should return whether the
        module is a package. The fullname is a str. Returns a bool.
        Raises ImportError if the module cannot be found.
        """
        raise ImportError
    def get_code(self, fullname):
        """Method which returns the code object for the module.
        The fullname is a str. Returns a types.CodeType if possible, else
        returns None if a code object does not make sense
        (e.g. built-in module). Raises ImportError if the module cannot be
        found.
        """
        source = self.get_source(fullname)
        if source is None:
            return None
        return self.source_to_code(source)
    @abc.abstractmethod
    def get_source(self, fullname):
        """Abstract method which should return the source code for the
        module. The fullname is a str. Returns a str.
        Raises ImportError if the module cannot be found.
        """
        raise ImportError
    @staticmethod
    def source_to_code(data, path='<string>'):
        """Compile 'data' into a code object.
        The 'data' argument can be anything that compile() can handle. The 'path'
        argument should be where the data was retrieved (when applicable)."""
        return compile(data, path, 'exec', dont_inherit=True)
    exec_module = _bootstrap_external._LoaderBasics.exec_module
    load_module = _bootstrap_external._LoaderBasics.load_module
The core piece is clearly the exec_module method, but InspectLoader also implements several optional extensions on top of it; see pep-0302-Optional Extensions to the Importer Protocol.
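Because InspectLoader already provides get_code and exec_module, a concrete loader only has to say where the source comes from. The sketch below is one possible minimal subclass that serves source held in a string. The class name `StringSourceLoader` and the module name `in_memory_demo` are made up for this illustration.

```python
import importlib.abc
import importlib.util


class StringSourceLoader(importlib.abc.InspectLoader):
    """Loader that executes source code held in a string."""

    def __init__(self, source):
        self.source = source

    def is_package(self, fullname):
        return False  # this loader only serves plain modules

    def get_source(self, fullname):
        # get_code() and exec_module() inherited from InspectLoader compile
        # and execute whatever this method returns.
        return self.source


loader = StringSourceLoader("ANSWER = 6 * 7\n")
spec = importlib.util.spec_from_loader("in_memory_demo", loader)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
print(module.ANSWER, module.__spec__.loader is loader)
```

The loader never sets `__name__`, `__spec__` or the other import attributes itself. That work is done by module_from_spec, which is exactly the simplification ModuleSpec was introduced for.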
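2.4 Hook registration
Once the import protocol is implemented, the custom importer has to be registered before it can be used. Depending on how it is registered, there are two kinds of hooks: Meta hooks and Path hooks.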
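2.4.1 Meta hooks
Meta hooks are called at the very start of the import flow. We can insert a Meta hook at any position in sys.meta_path, including the very front, which lets us override built-in modules, frozen modules and so on.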
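A minimal sketch of a Meta hook follows. It combines finder and loader in one importer class that serves modules from a dictionary and is inserted at the front of sys.meta_path. The class `InMemoryImporter` and the module `virtual_config` are made up for this illustration.

```python
import importlib.abc
import importlib.util
import sys


class InMemoryImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Importer serving modules from a dict mapping module name to source."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, fullname, path, target=None):
        if fullname not in self.sources:
            return None  # not ours: let the next finder try
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        exec(self.sources[module.__name__], module.__dict__)


importer = InMemoryImporter({"virtual_config": "TIMEOUT = 30\n"})
sys.meta_path.insert(0, importer)  # front of the list: consulted first
try:
    import virtual_config
    print(virtual_config.TIMEOUT, virtual_config.__spec__.loader is importer)
finally:
    sys.meta_path.remove(importer)
    sys.modules.pop("virtual_config", None)
```

Position matters: at index 0 this importer is asked before BuiltinImporter, so adding a name such as `time` to its dictionary would take precedence over the built-in module for imports that are not already cached in sys.modules.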
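2.4.2 Path hooks
Path hooks, by contrast, only apply to the entries in the sys.path list. They are registered by inserting callables into sys.path_hooks. The finder that a Path hook returns for a path entry is stored in sys.path_importer_cache, and this cache is checked first every time before the Path hooks are invoked.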
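The sketch below registers a Path hook for a single, artificial sys.path entry and then inspects sys.path_importer_cache. The entry `memory://demo`, the module `marker_demo` and the classes `MarkerLoader` and `MarkerFinder` are all made up for this illustration.

```python
import importlib.abc
import importlib.util
import sys

MARKER = "memory://demo"  # artificial sys.path entry, not a real directory


class MarkerLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        module.SOURCE = MARKER


class MarkerFinder(importlib.abc.PathEntryFinder):
    """Path entry finder responsible for the single MARKER entry."""

    def find_spec(self, fullname, target=None):
        if fullname != "marker_demo":
            return None
        return importlib.util.spec_from_loader(fullname, MarkerLoader())


def marker_hook(path_entry):
    # A path hook returns a finder for entries it understands and raises
    # ImportError for everything else so that the next hook gets a chance.
    if path_entry != MARKER:
        raise ImportError(f"not handled: {path_entry!r}")
    return MarkerFinder()


sys.path_hooks.append(marker_hook)
sys.path.append(MARKER)
try:
    import marker_demo
    cached = sys.path_importer_cache[MARKER]
    print(marker_demo.SOURCE, type(cached).__name__)
finally:
    sys.path.remove(MARKER)
    sys.path_hooks.remove(marker_hook)
    sys.path_importer_cache.pop(MARKER, None)
    sys.modules.pop("marker_demo", None)
```

The hook is registered before the entry is added to sys.path on purpose. If the entry had been searched earlier, a None result could already be cached for it in sys.path_importer_cache and the hook would not be asked until the caches were invalidated.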
That's it for the principles part. Next time we will bring in real-world scenarios and see what problems the Import System can solve in day-to-day work.