Friday, December 17, 2010

Extending and Embedding the Python Interpreter (Translation Draft): 1.12 Providing a C API for an Extension Module

1.12 Providing a C API for an Extension Module

Many extension modules just provide new functions and types to be used from Python, but sometimes the code in an extension module can be useful for other extension modules. For example, an extension module could implement a type “collection” that works like an unordered list. Just as the standard Python list type has a C API that permits extension modules to create and manipulate lists, this new collection type should have a set of C functions for direct manipulation from other extension modules.

At first sight this seems easy: just write the functions (without declaring them static, of course), provide an appropriate header file, and document the C API. And in fact this would work if all extension modules were always linked statically with the Python interpreter. When modules are used as shared libraries, however, the symbols defined in one module may not be visible to another module. The details of visibility depend on the operating system; some systems use one global namespace for the Python interpreter and all extension modules (Windows, for example), whereas others require an explicit list of imported symbols at module link time (AIX is one example), or offer a choice of different strategies (most Unix variants). And even if symbols are globally visible, the module whose functions one wishes to call might not have been loaded yet!

Portable code therefore must not make any assumptions about symbol visibility. This means that all symbols in extension modules should be declared static, except for the module’s initialization function, in order to avoid name clashes with other extension modules (as discussed in section 1.4). It also means that symbols that should be accessible from other extension modules must be exported in a different way.

Python provides a special mechanism to pass C-level information (pointers) from one extension module to another: CObjects. A CObject is a Python data type which stores a pointer (void *). CObjects can only be created and accessed via their C API, but they can be passed around like any other Python object. In particular, they can be assigned to a name in an extension module’s namespace. Other extension modules can then import this module, retrieve the value of this name, and then retrieve the pointer from the CObject.
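The publish-and-retrieve flow described above can be sketched in plain C, without the Python headers. Everything here is a hypothetical stand-in: PointerBox plays the role of a CObject, Namespace plays the role of a module namespace, and the function names are invented for illustration. Later Python versions replace CObjects with capsules (PyCapsule), which follow the same outline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a CObject: an opaque box holding one pointer. */
typedef struct {
    void *pointer;
} PointerBox;

/* Hypothetical stand-in for a module namespace: a few name/box pairs. */
#define NAMESPACE_CAPACITY 8

typedef struct {
    const char *names[NAMESPACE_CAPACITY];
    PointerBox boxes[NAMESPACE_CAPACITY];
    size_t count;
} Namespace;

/* Publish a pointer under a name; returns 0 on success, -1 if full. */
static int
namespace_publish(Namespace *ns, const char *name, void *pointer)
{
    assert(name != NULL);
    if (ns->count >= NAMESPACE_CAPACITY)
        return -1;
    ns->names[ns->count] = name;
    ns->boxes[ns->count].pointer = pointer;
    ns->count++;
    return 0;
}

/* Look a name up and unwrap the pointer; returns NULL if not found. */
static void *
namespace_lookup(const Namespace *ns, const char *name)
{
    size_t i;

    for (i = 0; i < ns->count; i++) {
        if (strcmp(ns->names[i], name) == 0)
            return ns->boxes[i].pointer;
    }
    return NULL;
}
```

The exporting side calls namespace_publish() once; any client that knows the name can later call namespace_lookup() and cast the result back to the type both sides agreed on.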

There are many ways in which CObjects can be used to export the C API of an extension module. Each name could get its own CObject, or all C API pointers could be stored in an array whose address is published in a CObject. And the various tasks of storing and retrieving the pointers can be distributed in different ways between the module providing the code and the client modules.
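As one illustration of the "single table" strategy, here is a minimal sketch that uses a typed struct of function pointers instead of the array of void pointers used later in this section. That swap is deliberate and is not what the text's example does; the Collection_API type and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical API table: one typed slot per exported function. */
typedef struct {
    int (*add)(int, int);
    int (*negate)(int);
} Collection_API;

static int collection_add(int a, int b) { return a + b; }
static int collection_negate(int a) { return -a; }

/* Exporter side: the table has static storage, so its address stays valid. */
static Collection_API collection_api = { collection_add, collection_negate };

/* What the exporter would store in a CObject: the table's address. */
static void *
collection_export(void)
{
    return (void *)&collection_api;
}

/* Client side: recover the typed table from the untyped pointer. */
static const Collection_API *
collection_import(void *published)
{
    assert(published != NULL);
    return (const Collection_API *)published;
}
```

A struct gives the compiler type information for every slot, at the cost of requiring both sides to agree on the struct layout; the array-of-pointers form relies on index constants and casts instead.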

The following example demonstrates an approach that puts most of the burden on the writer of the exporting module, which is appropriate for commonly used library modules. It stores all C API pointers (just one in the example!) in an array of void pointers which becomes the value of a CObject. The header file corresponding to the module provides a helper, import_spam(), that takes care of importing the module and retrieving its C API pointers; client modules only have to call it before accessing the C API.

The exporting module is a modification of the spam module from section 1.1. The function spam.system() does not call the C library function system() directly, but a function PySpam_System(), which would of course do something more complicated in reality (such as adding “spam” to every command). This function PySpam_System() is also exported to other extension modules.

The function PySpam_System() is a plain C function, declared static like everything else:

static int
PySpam_System(const char *command)
{
    return system(command);
}
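The text hints that a real PySpam_System() might add “spam” to every command. A minimal sketch of that string-building step follows; the helper name build_spam_command and the exact "spam " prefix are assumptions made for illustration, not part of the original module.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "spam <command>" in a freshly allocated buffer that the caller
   must free. Returns NULL if allocation fails. Hypothetical helper. */
static char *
build_spam_command(const char *command)
{
    static const char prefix[] = "spam ";
    size_t length;
    char *buffer;

    assert(command != NULL);
    /* prefix without its terminator, plus command, plus one terminator */
    length = (sizeof(prefix) - 1) + strlen(command) + 1;
    buffer = malloc(length);
    if (buffer == NULL)
        return NULL;
    snprintf(buffer, length, "%s%s", prefix, command);
    return buffer;
}
```

A richer PySpam_System() could pass the returned buffer to system() and then free it, keeping the same int return type so that callers are unaffected.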

The function spam_system() is modified in a trivial way:

static PyObject *
spam_system(PyObject *self, PyObject *args)
{
    const char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = PySpam_System(command);
    return Py_BuildValue("i", sts);
}

At the beginning of the module, right after the line

#include "Python.h"

two more lines must be added:

#define SPAM_MODULE
#include "spammodule.h"

The #define is used to tell the header file that it is being included in the exporting module, not a client module. Finally, the module’s initialization function must take care of initializing the C API pointer array:

PyMODINIT_FUNC
initspam(void)
{
    PyObject *m;
    static void *PySpam_API[PySpam_API_pointers];
    PyObject *c_api_object;

    m = Py_InitModule("spam", SpamMethods);
    if (m == NULL)
        return;

    /* Initialize the C API pointer array */
    PySpam_API[PySpam_System_NUM] = (void *)PySpam_System;

    /* Create a CObject containing the API pointer array's address */
    c_api_object = PyCObject_FromVoidPtr((void *)PySpam_API, NULL);

    if (c_api_object != NULL)
        PyModule_AddObject(m, "_C_API", c_api_object);
}

Note that PySpam_API is declared static; otherwise the pointer array would disappear when initspam() terminates!
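The lifetime point can be checked in isolation. The sketch below is plain C with hypothetical names: a function fills a static local array and returns its address, which remains valid after the function returns.

```c
#include <assert.h>
#include <stddef.h>

/* Fill a one-slot table and return its address. Because the array is
   static, it outlives the call; an automatic (non-static) array here
   would be destroyed on return and the returned pointer would dangle. */
static void **
make_api_table(void *first_entry)
{
    static void *table[1];

    assert(first_entry != NULL);
    table[0] = first_entry;
    return table;
}
```

Every call returns the same address, which is exactly what lets initspam() hand the array to a CObject and then return.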

The bulk of the work is in the header file ‘spammodule.h’, which looks like this:

#ifndef Py_SPAMMODULE_H
#define Py_SPAMMODULE_H
#ifdef __cplusplus
extern "C" {
#endif

/* Header file for spammodule */

/* C API functions */
#define PySpam_System_NUM 0
#define PySpam_System_RETURN int
#define PySpam_System_PROTO (const char *command)

/* Total number of C API pointers */
#define PySpam_API_pointers 1

#ifdef SPAM_MODULE
/* This section is used when compiling spammodule.c */

static PySpam_System_RETURN PySpam_System PySpam_System_PROTO;

#else
/* This section is used in modules that use spammodule's API */

static void **PySpam_API;

#define PySpam_System \
 (*(PySpam_System_RETURN (*)PySpam_System_PROTO) PySpam_API[PySpam_System_NUM])

/* Return -1 and set exception on error, 0 on success. */
static int
import_spam(void)
{
    PyObject *module = PyImport_ImportModule("spam");
    PyObject *c_api_object;

    if (module == NULL)
        return -1;  /* import failed; exception already set */
    c_api_object = PyObject_GetAttrString(module, "_C_API");
    Py_DECREF(module);
    if (c_api_object == NULL)
        return -1;
    if (!PyCObject_Check(c_api_object)) {
        Py_DECREF(c_api_object);
        PyErr_SetString(PyExc_ImportError, "spam._C_API is not a CObject");
        return -1;
    }
    PySpam_API = (void **)PyCObject_AsVoidPtr(c_api_object);
    Py_DECREF(c_api_object);
    return 0;
}

#endif

#ifdef __cplusplus
}
#endif

#endif /* !defined(Py_SPAMMODULE_H) */
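The least obvious part of the header is the macro that turns a table slot back into a callable function. The following single-file sketch reproduces that pattern without the Python headers, so both the exporter and client halves sit side by side. All Demo_ and demo_ names are hypothetical. Like the header above, it converts between function pointers and void *, which POSIX platforms support but ISO C does not strictly guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Slot layout, as a shared header would define it (names hypothetical). */
#define Demo_Length_NUM 0
#define Demo_Length_RETURN int
#define Demo_Length_PROTO (const char *text)
#define Demo_API_pointers 1

/* Exporter side: the real function and the table that holds its address. */
static Demo_Length_RETURN demo_length_impl Demo_Length_PROTO
{
    return (int)strlen(text);
}

static void *demo_api_table[Demo_API_pointers];

static void **
demo_export(void)
{
    demo_api_table[Demo_Length_NUM] = (void *)demo_length_impl;
    return demo_api_table;
}

/* Client side: a table pointer set at import time, plus a macro that casts
   the slot back to the right function-pointer type and dereferences it. */
static void **Demo_API;

#define Demo_Length \
    (*(Demo_Length_RETURN (*)Demo_Length_PROTO) Demo_API[Demo_Length_NUM])

static void
demo_import(void **table)
{
    assert(table != NULL);
    Demo_API = table;
}
```

After demo_import(demo_export()), client code can write Demo_Length("spam") as if it were an ordinary function; the macro hides the index lookup and the cast.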

All that a client module must do in order to have access to the function PySpam_System() is to call import_spam() (a static function defined in the header) in its initialization function:

PyMODINIT_FUNC
initclient(void)
{
    PyObject *m;

    m = Py_InitModule("client", ClientMethods);
    if (m == NULL)
        return;
    if (import_spam() < 0)
        return;
    /* additional initialization can happen here */
}

The main disadvantage of this approach is that the file ‘spammodule.h’ is rather complicated. However, the basic structure is the same for each function that is exported, so it has to be learned only once.

Finally it should be mentioned that CObjects offer additional functionality, which is especially useful for memory allocation and deallocation of the pointer stored in a CObject. The details are described in the Python/C API Reference Manual in the section “CObjects” and in the implementation of CObjects (files ‘Include/cobject.h’ and ‘Objects/cobject.c’ in the Python source code distribution).
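The second argument of PyCObject_FromVoidPtr(), passed as NULL in initspam() above, is a destructor callback of type void (*)(void *). The idea behind it can be sketched in plain C as a pointer that travels together with the function that knows how to release it. The OwnedPointer type and the function names below are hypothetical stand-ins, not the CObject implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a CObject with a destructor callback. */
typedef struct {
    void *pointer;
    void (*destructor)(void *);
} OwnedPointer;

static OwnedPointer
owned_pointer_new(void *pointer, void (*destructor)(void *))
{
    OwnedPointer box;

    box.pointer = pointer;
    box.destructor = destructor;
    return box;
}

/* Called when the box is no longer needed; runs the destructor, if any,
   and clears the pointer so a second release is harmless. */
static void
owned_pointer_release(OwnedPointer *box)
{
    assert(box != NULL);
    if (box->destructor != NULL && box->pointer != NULL)
        box->destructor(box->pointer);
    box->pointer = NULL;
}

/* Example destructor that counts its calls, then frees the buffer. */
static int release_count = 0;

static void
counting_free(void *pointer)
{
    release_count++;
    free(pointer);
}
```

In the real API the interpreter invokes the destructor when the CObject is deallocated, so a module that publishes heap-allocated data does not need a separate cleanup hook.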
