Implementing the buffer protocol · Cython 3.0 documentation (Chinese translation)

# Implementing the buffer protocol

> Original: [http://docs.cython.org/en/latest/src/userguide/buffer.html](http://docs.cython.org/en/latest/src/userguide/buffer.html)

Cython objects can expose memory buffers to Python code by implementing the "buffer protocol". This chapter shows how to implement the protocol and how to use the memory managed by an extension type from NumPy.

## A matrix class

The following Cython/C++ code implements a matrix of floats in which the number of columns is fixed at construction time but rows can be added dynamically.

```py
# distutils: language = c++

# matrix.pyx

from libcpp.vector cimport vector

cdef class Matrix:
    cdef unsigned ncols
    cdef vector[float] v

    def __cinit__(self, unsigned ncols):
        self.ncols = ncols

    def add_row(self):
        """Adds a row, initially zero-filled."""
        self.v.resize(self.v.size() + self.ncols)
```

The class has no methods that do anything productive with the matrix's contents. We could implement custom `__getitem__`, `__setitem__`, and so on, but instead we will use the buffer protocol to expose the matrix's data to Python, so that NumPy can do the useful work.

Implementing the buffer protocol requires adding two methods, `__getbuffer__` and `__releasebuffer__`, which Cython handles specially.

```py
# distutils: language = c++

from cpython cimport Py_buffer
from libcpp.vector cimport vector

cdef class Matrix:
    cdef Py_ssize_t ncols
    cdef Py_ssize_t shape[2]
    cdef Py_ssize_t strides[2]
    cdef vector[float] v

    def __cinit__(self, Py_ssize_t ncols):
        self.ncols = ncols

    def add_row(self):
        """Adds a row, initially zero-filled."""
        self.v.resize(self.v.size() + self.ncols)

    def __getbuffer__(self, Py_buffer *buffer, int flags):
        cdef Py_ssize_t itemsize = sizeof(self.v[0])

        # Integer (floor) division: the number of complete rows stored so far.
        self.shape[0] = self.v.size() // self.ncols
        self.shape[1] = self.ncols

        # Stride 1 is the distance, in bytes, between two items in a row;
        # this is the distance between two adjacent items in the vector.
        # Stride 0 is the distance between the first elements of adjacent rows.
        self.strides[1] = <Py_ssize_t>(  <char *>&(self.v[1])
                                       - <char *>&(self.v[0]))
        self.strides[0] = self.ncols * self.strides[1]

        buffer.buf = <char *>&(self.v[0])
        buffer.format = 'f'                     # float
        buffer.internal = NULL                  # see References
        buffer.itemsize = itemsize
        buffer.len = self.v.size() * itemsize   # product(shape) * itemsize
        buffer.ndim = 2
        buffer.obj = self
        buffer.readonly = 0
        buffer.shape = self.shape
        buffer.strides = self.strides
        buffer.suboffsets = NULL                # for pointer arrays only

    def __releasebuffer__(self, Py_buffer *buffer):
        pass
```

The method `Matrix.__getbuffer__` fills in a descriptor structure called `Py_buffer`, which is defined by the Python C-API. It contains a pointer to the actual buffer in memory, together with metadata about the array's shape and strides (the step sizes, in bytes, from one element or row to the next). Its `shape` and `strides` members are pointers that must point to arrays of type and size `Py_ssize_t[ndim]`. These arrays have to stay alive for as long as any buffer views the data, so we store them as members on the `Matrix` object.

The code is not yet complete, but we can already compile it and test the basic functionality.

```py
>>> from matrix import Matrix
>>> import numpy as np
>>> m = Matrix(10)
>>> np.asarray(m)
array([], shape=(0, 10), dtype=float32)
>>> m.add_row()
>>> a = np.asarray(m)
>>> a[:] = 1
>>> m.add_row()
>>> a = np.asarray(m)
>>> a
array([[ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]], dtype=float32)
```

Now we can view the `Matrix` as a NumPy `ndarray` and modify its contents using standard NumPy operations.

## Memory safety and reference counting

The `Matrix` class as implemented so far is unsafe. The `add_row` operation can move the underlying buffer, which invalidates any NumPy (or other) view on the data. If you try to access values through such a view after an `add_row` call, you will get outdated values or a segfault.

This is where `__releasebuffer__` comes in. We can add a reference count to each matrix and lock it against mutation whenever a view exists.

```py
# distutils: language = c++

from cpython cimport Py_buffer
from libcpp.vector cimport vector

cdef class Matrix:
    cdef int view_count

    cdef Py_ssize_t ncols
    cdef vector[float] v
    # ...

    def __cinit__(self, Py_ssize_t ncols):
        self.ncols = ncols
        self.view_count = 0

    def add_row(self):
        # Refuse to resize (and possibly move) the buffer while views exist.
        if self.view_count > 0:
            raise ValueError("can't add row while being viewed")
        self.v.resize(self.v.size() + self.ncols)

    def __getbuffer__(self, Py_buffer *buffer, int flags):
        # ... as before

        self.view_count += 1

    def __releasebuffer__(self, Py_buffer *buffer):
        self.view_count -= 1
```

## Flags

We skipped some input validation in the code. The `flags` argument to `__getbuffer__` comes from `np.asarray` (and other clients) and is a bitwise OR of boolean flags that describe the kind of array being requested. Strictly speaking, if the flags contain `PyBUF_ND`, `PyBUF_SIMPLE`, or `PyBUF_F_CONTIGUOUS`, `__getbuffer__` must raise a `BufferError`. These macros can be `cimport`'d from `cpython.buffer`.

(The matrix-in-vector structure actually conforms to `PyBUF_ND`, but that would prohibit `__getbuffer__` from filling in the strides. A single-row matrix is F-contiguous, but a larger matrix is not.)

## References

The buffer interface used here is set out in [**PEP 3118**](https://www.python.org/dev/peps/pep-3118), "Revising the buffer protocol".

A tutorial for using this API from C is on Jake Vanderplas's blog, [An Introduction to the Python Buffer Protocol](https://jakevdp.github.io/blog/2014/05/05/introduction-to-the-python-buffer-protocol/).

Reference documentation is available for [Python 3](https://docs.python.org/3/c-api/buffer.html) and [Python 2](https://docs.python.org/2.7/c-api/buffer.html). The Python 2 documentation also describes an older buffer protocol that is no longer in use. Since Python 2.6, the [**PEP 3118**](https://www.python.org/dev/peps/pep-3118) protocol has been implemented, and the older protocol is only relevant for legacy code.
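
As a side illustration of the ideas in this chapter, the same buffer metadata and the same "no resizing while viewed" rule can be observed from plain Python using only built-in types. The sketch below is not the `Matrix` class: it assumes a `bytearray` standing in for the vector, and the names `storage`, `flat`, `view`, and `resized_while_viewed` are made up for the example. A `memoryview` reports the fields that `__getbuffer__` fills in, and `bytearray` refuses to resize while a view is exported, much like the `view_count` check in `add_row`.

```python
import struct

ncols = 10
itemsize = struct.calcsize("f")  # size of a native C float, in bytes

# Zero-filled storage for two rows of `ncols` floats (stand-in for Matrix.v).
storage = bytearray(2 * ncols * itemsize)

# A 1-D byte view, recast as a 2-D C-contiguous array of floats.
flat = memoryview(storage)
view = flat.cast("f", shape=[2, ncols])

# These attributes mirror the Py_buffer fields set in __getbuffer__.
shape = view.shape      # (rows, columns)
strides = view.strides  # (bytes between rows, bytes between items in a row)
nbytes = view.nbytes    # product(shape) * itemsize

# While any view is exported, growing the bytearray raises BufferError,
# because resizing could move the underlying memory.
try:
    storage.extend(bytes(ncols * itemsize))
    resized_while_viewed = True
except BufferError:
    resized_while_viewed = False

# Releasing every view lifts the lock, so a new row can be appended.
view.release()
flat.release()
storage.extend(bytes(ncols * itemsize))
```

The built-in type signals the conflict with `BufferError`, whereas the `Matrix` example above raises `ValueError`; either way the mutation is rejected until all views have been released.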