Building Third-Party Packages

Building a PyPI Wheel

A PyPI wheel is easy to build: write a correct setup.py file, then run the following command:

python3 setup.py sdist bdist_wheel

The *.tar.gz and *.whl files will then appear in the ./dist directory.

Whether setup.py is written correctly decides whether the build succeeds. A sample setup.py is shown below:

from pathlib import Path

from Cython.Build import cythonize
# Extension is imported from setuptools because distutils was removed in Python 3.12
from setuptools import Extension, find_packages, setup

setup(
    name='gvasp',
    description='A quick post-process for resolve or assistant the VASP calculations',
    author='hui_zhou',
    long_description=Path("./README.md").read_text(),
    long_description_content_type='text/markdown',
    version='0.0.1',
    license='GPL-3.0',
    python_requires='>=3.9',
    packages=find_packages(),
    install_requires=[
        'Cython',
        'lxml',
        'matplotlib',
        'numpy',
        'pandas',
        'pymatgen',
        'pymatgen-analysis-diffusion',
        'pyyaml',
        'scipy'],
    ext_modules=cythonize([Extension(name='gvasp.lib._dos', sources=['extension/_dos/_dos.pyx']),
                           Extension(name='gvasp.lib._file', sources=['extension/_file/_file.cpp',
                                                                      'extension/_file/_lib.cpp'])], language_level=3),
    include_dirs=['/usr/lib/gcc/x86_64-linux-gnu/11/include', '/home/hzhou/anaconda3/include',
                  '/home/hzhou/anaconda3/Library/include'],
    include_package_data=True,
    package_data={"gvasp": ["*.json", "*.yaml", "INCAR", "pot.tgz"]},
    entry_points={'console_scripts': ['gvasp = gvasp.main:main']}
)

where

  • name, author, description, version and license are package metadata

  • long_description and long_description_content_type provide the project description shown on PyPI (it can be written only once, before the package is uploaded)

  • packages: find_packages() searches the current directory (.) for packages, that is, directories containing an __init__.py

  • install_requires lists the dependencies of the package; they are installed together with the package

  • ext_modules and include_dirs are related to the C/C++ extensions

  • package_data is the data you want to include (files that live inside the package); for other data (outside the package), write a MANIFEST.in file to include them

  • entry_points defines a command named gvasp that stands for running python3 gvasp/main.py (more precisely, it calls the main function in gvasp.main)
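To make the entry_points item more concrete, the sketch below shows roughly how a console-script specification such as 'gvasp = gvasp.main:main' is interpreted: the text before = is the command name, and module:function names the callable to run. The helper resolve_entry_point is hypothetical (it is not part of setuptools), and json:dumps stands in for gvasp.main:main so that the example runs without gvasp installed.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name = module:attr' and import the callable it points to.

    Hypothetical helper for illustration only; the real lookup is done by
    the launcher script generated when the package is installed.
    """
    name, _, target = (part.strip() for part in spec.partition("="))
    module_name, _, attr = target.partition(":")
    module = importlib.import_module(module_name)
    return name, getattr(module, attr)


# json:dumps stands in for gvasp.main:main so the sketch runs anywhere.
command, func = resolve_entry_point("demo = json:dumps")
print(command, func([1, 2]))
```

With the real specification, the resolved callable would be the main function of gvasp/main.py, which is what the gvasp command ends up calling.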

In fact, generating the *.whl is only the first step on Linux. PyPI checks the platform tag in the name of an uploaded *.whl file and rejects a plain linux_x86_64 tag; for a glibc-based build like this one, the name must carry a manylinux_* field as defined by the PEP rules (PEP 513 (manylinux1), PEP 571 (manylinux2010), PEP 599 (manylinux2014) and PEP 600 (manylinux_x_y)). So anyone who wants to upload the package to PyPI has to repair the wheel so that it carries the manylinux field.
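The tag check described above can be illustrated with a small sketch. Under the wheel naming convention of PEP 427, the platform tag is the last dash-separated field of the file name. The helper names and the example file names below are made up for illustration, and this is not how PyPI itself implements the check.

```python
def platform_tag(wheel_name):
    """Return the platform tag, the last '-' separated field of a wheel name."""
    if not wheel_name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {wheel_name}")
    return wheel_name[: -len(".whl")].split("-")[-1]


def has_manylinux_tag(wheel_name):
    """True if any part of the (possibly dot-compressed) platform tag is manylinux."""
    return any(tag.startswith("manylinux")
               for tag in platform_tag(wheel_name).split("."))


# Hypothetical file names: a freshly built wheel and its repaired counterpart.
for name in ("gvasp-0.0.1-cp39-cp39-linux_x86_64.whl",
             "gvasp-0.0.1-cp39-cp39-manylinux_2_28_x86_64.whl"):
    print(name, has_manylinux_tag(name))
```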

Fortunately, with a Docker image and the auditwheel tool, the wheel is easy to repair.

For example, follow these steps:

  1. pull the Docker image, here manylinux_2_28_x86_64

docker pull quay.io/pypa/manylinux_2_28_x86_64

  2. start and attach to a container

docker run -it quay.io/pypa/manylinux_2_28_x86_64 "/bin/bash"

  3. transfer the source code to the Docker container

docker cp local_path container_id:docker_path

  4. recompile the package inside the container and obtain the *.whl

python3 setup.py bdist_wheel

  5. repair the *.whl

auditwheel repair *.whl

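Finally, a new wheel with the manylinux field will appear in the wheelhouse directory.

Then you can upload the wheel to PyPI with the command below (replace the unrepaired wheel in dist with the repaired one from wheelhouse first, since the unrepaired wheel would be rejected):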

twine upload dist/*

Building a Conda Package

Compared with building a PyPI package, building a conda package is really painful, because you will meet dependency problems everywhere.

That said, a conda package actually only needs a meta.yaml and a build.sh (at least in my case), like this:

package:
  name: gvasp
  version: 0.0.1

source:
  path: .

requirements:
  build:
    - {{ compiler('c') }}
    - {{ compiler('cxx') }}

  host:
    - python
    - Cython
    - setuptools

  run:
    - python
    - numpy
    - Cython
    - lxml
    - matplotlib
    - pandas
    - pymatgen
    - pymatgen-analysis-diffusion
    - pyyaml
    - scipy

about:
  home: https://github.com/Rasic2/gvasp
  license: GPL-3.0

and this:

export CFLAGS="${CFLAGS} -isysroot ${CONDA_BUILD_SYSROOT}"
export CXXFLAGS="${CXXFLAGS} -isysroot ${CONDA_BUILD_SYSROOT}"
$PYTHON -m pip install . --no-deps -vv

First, let us look at meta.yaml.

  • The package section holds the package information

  • The source section controls how the source is obtained (git, PyPI, local or other); here we use local (we suggest creating a new directory (for example, conda) and putting the necessary source and data there, including meta.yaml and build.sh)

  • requirements is the most troublesome section, because it has three different parts: build, host and run.

    • build represents the system infrastructure, so revision control systems (Git, SVN), make tools (GNU make, Autotools, CMake), compilers (real cross, pseudo-cross, or native when not cross-compiling) and any source pre-processors go there. For example, we put the C/C++ compilers in this section.

    • host covers what setup.py needs; there we use Cython, setuptools and Python's built-in modules, so we put them in this section.

    • run is simple: it mirrors install_requires (note that the pymatgen-* packages are not in the default channels, so we add conda-forge as an extra channel)

    • During the build, conda creates a new directory at envs/**/conda-bld/package_name. Three directories are made under it: _build_env, _placeholder_placeholder_ and work. The compilers from the build section are downloaded and installed in _build_env. The _placeholder_placeholder_ directory holds the conda environment: python, setuptools and Cython are installed here, so it is basically the same as a new conda environment. The work directory is a copy of your source code, and the real build work (compiling and packaging) happens here.
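As a quick consistency check on the run section, the sketch below compares it with install_requires from setup.py. The two lists are copied from the examples in this document; the comparison itself is only an illustration and is not something conda-build performs.

```python
# Copied from install_requires in the setup.py example.
install_requires = ["Cython", "lxml", "matplotlib", "numpy", "pandas", "pymatgen",
                    "pymatgen-analysis-diffusion", "pyyaml", "scipy"]

# Copied from the run section of the meta.yaml example.
run_requirements = ["python", "numpy", "Cython", "lxml", "matplotlib", "pandas",
                    "pymatgen", "pymatgen-analysis-diffusion", "pyyaml", "scipy"]


def normalized(names):
    """Lower-case the names so the comparison is case-insensitive."""
    return {name.lower() for name in names}


# Packages that appear in only one of the two lists.
only_in_run = normalized(run_requirements) - normalized(install_requires)
only_in_setup = normalized(install_requires) - normalized(run_requirements)
print("only in run:", sorted(only_in_run))
print("only in install_requires:", sorted(only_in_setup))
```

For these two lists the only difference is python itself, which conda has to list explicitly while pip takes the interpreter for granted.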

Next, let us look at build.sh:

  • Because we use Cython, we redefine CFLAGS and CXXFLAGS; detailed information can be found here.

  • The $PYTHON environment variable points to the Python interpreter in the _placeholder_placeholder_ directory; do not use the bare python command.

Now we can use the command below to actually build the package:

conda-build . -c conda-forge

. is the directory containing meta.yaml, and the conda-forge channel is used because of the pymatgen-* packages.

After that, package.tar.bz2 is written to the conda-bld/linux-64 directory (it contains the bin, info and lib directories).

Note

The bin directory appears because we use entry_points; the info directory stores the recipe and metadata; lib is the actual built package.
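To check this layout yourself, a short sketch using the standard tarfile module can list the top-level entries of the built archive. The function name is made up for illustration, and the archive paths are whatever you pass on the command line; substitute the file that conda-build actually wrote.

```python
import sys
import tarfile
from pathlib import PurePosixPath


def top_level_entries(archive_path):
    """Return the sorted top-level names inside a .tar.bz2 archive."""
    with tarfile.open(archive_path, "r:bz2") as archive:
        # Keep only the first path component of every member name.
        parts = (PurePosixPath(name).parts for name in archive.getnames())
        return sorted({p[0] for p in parts if p})


if __name__ == "__main__":
    # Usage: python this_script.py path/to/package.tar.bz2
    for path in sys.argv[1:]:
        print(path, top_level_entries(path))
```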

Finally, we can use the anaconda command to upload the package:

anaconda upload *.tar.bz2

The package can then be installed like this:

conda install -c hui_zhou -c conda-forge gvasp

Attention

When installing the package, note that we built it with the compilers from the conda-forge channel, so we explicitly add this channel at install time; otherwise conflicts will occur.

Installing Third-Party Packages

  • Install a conda package

conda install package

When reinstalling the same version of a package, add the --force-reinstall option.