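Assorted troubleshooting notes (continuously updated)

A solution for multi-threaded access to shared objects in JNI

References:

A solution for aggregating filtered results in ElasticSearch

References: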

PHP error: `Can't use function return value in write context`

Cause:

Before PHP 5.5, empty() accepts only variables; passing it an expression such as a function's return value triggers this error.

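MySQL client cannot specify a port when connecting to a local server

When the target host is localhost (the default), the MySQL client connects through a Unix socket and ignores the port option. Use 127.0.0.1 as the host instead to force a TCP connection.

References: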

https://serverfault.com/questions/306421/why-does-the-mysql-command-line-tool-ignore-the-port-parameter

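Docker for Windows cannot listen on port 2375

Port 2375 falls inside a port range reserved by Windows; it has to be excluded from that reservation before Docker can use it.

References: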

https://github.com/docker/for-win/issues/3546

Installing mysqlclient fails on Windows

Install Visual C++ Build Tools 2015, MySQL Connector C 6.1, and the MariaDB connector.

Then copy the corresponding files from the MariaDB installation directory to:

C:\Program Files (x86)\MySQL\MySQL Connector C 6.1\include\mariadb
C:\Program Files (x86)\MySQL\MySQL Connector C 6.1\lib\mariadb

References:

https://devblogs.microsoft.com/python/unable-to-find-vcvarsall-bat/
https://stackoverflow.com/questions/51294268/pip-install-mysqlclient-returns-fatal-error-c1083-cannot-open-file-mysql-h

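A docker-compose service needs to wait until other services have finished starting

Define a healthcheck on the dependency in docker-compose, and have the dependent service wait for it through depends_on with condition: service_healthy.

References: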

https://stackoverflow.com/questions/31746182/docker-compose-wait-for-container-x-before-starting-y/41854997#41854997

Cannot create child processes inside a Celery task

Solution:

Use billiard.context.Process instead of multiprocessing.Process.

References:

https://github.com/celery/celery/issues/4551

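Celery ResultSet.get() blocks when called with a callback

Cause:

ResultSet.on_ready is a vine.promise object; until it is called, get() keeps polling the result state.

Solution (choose one):

Use the GroupResult returned by group() instead.
Pass an on_interval argument: lambda: rs._on_ready() if rs.ready() else None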

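Running a Celery task on Windows fails with: `ValueError: not enough values to unpack (expected 3, got 0)`

Celery 4.x and later no longer support Windows. The problem lies in the underlying billiard library (it has lingered for two years; someone once submitted a PR with a fix, but the author seems unwilling to "fix" it).

Solution (choose one):

Set an environment variable: set FORKED_BY_MULTIPROCESSING=1
Run celery with -P eventlet/gevent/solo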
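
The first option can also be applied from Python instead of the shell. This is a minimal sketch, assuming it is placed at the very top of the module that defines the Celery app so that it runs before any worker process is spawned; that placement is an assumption, not something Celery requires by name.

```python
import os

# Assumption: this runs before the Celery app object is created.
# setdefault() keeps any value already exported in the shell.
os.environ.setdefault("FORKED_BY_MULTIPROCESSING", "1")

flag = os.environ["FORKED_BY_MULTIPROCESSING"]
```

Using setdefault() rather than plain assignment lets an explicit shell setting take precedence.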

References:

https://www.distributedpython.com/2018/08/21/celery-4-windows/
https://stackoverflow.com/questions/37255548/how-to-run-celery-on-windows/47331438
https://github.com/celery/celery/issues/4081
https://github.com/celery/celery/issues/4178

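Removing a submodule from a git repository

git itself does not provide a dedicated command for this; it can be done manually with the following steps:

git submodule deinit <path_to_submodule>
git rm --cached <path_to_submodule>
git commit -m "Removed submodule"

References: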

https://gist.github.com/myusuf3/7f645819ded92bda6677

Changing the remote URL of a submodule in a git repository

Edit the url attribute of the corresponding module in the .gitmodules file.

Then run git submodule sync to propagate the new URL to .git/config.

References:

https://www.jianshu.com/p/ed0cb6c75e25

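`unittest.mock.patch()` cannot patch the same object more than once

Background:

test1 imports func_module.func() and tests it. func() references moduleA, so the test patches moduleA to return Mock1 and then runs func().

test2 likewise imports func_module.func() and tests it. func() references moduleA, so the test patches moduleA again, this time to return Mock2, and then runs func().

At this point the moduleA object used inside func() in test2 is still Mock1 from test1, so assertions made against Mock2 fail.

Cause:

When test1 runs, moduleA is patched, and the resulting mock_moduleA is imported by func_module and used by func(). When test2 runs, moduleA is patched again, but because an imported moduleA object already exists in the current context (the mock_moduleA imported during test1), the interpreter does not import it again and func() keeps using the mock_moduleA from test1. The root cause appears to be that pytest runs all tests in one interpreter process and does not reset imported modules between them, so the import chain set up in test1 leaks into test2.

Solution:

Patch func_module.moduleA (the name func() actually uses) rather than the global moduleA (which func_module imports).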
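
The fix can be illustrated with a small self-contained sketch. The two modules are built in memory so the example runs on its own; `module_a`, its `fetch()` function, and the `run_with_mock()` helper are hypothetical names chosen for illustration, not part of the original project.

```python
import sys
import types
from unittest import mock

# Hypothetical dependency module with one function.
module_a = types.ModuleType("module_a")


def _real_fetch():
    return "real"


module_a.fetch = _real_fetch
sys.modules["module_a"] = module_a

# Hypothetical module under test: it binds the dependency to the name
# "moduleA" once, at import time.
func_module = types.ModuleType("func_module")
exec(
    "import module_a as moduleA\n"
    "\n"
    "def func():\n"
    "    return moduleA.fetch()\n",
    func_module.__dict__,
)
sys.modules["func_module"] = func_module


def run_with_mock(return_value):
    # Patch the name that func() looks up, inside func_module itself.
    with mock.patch("func_module.moduleA") as fake:
        fake.fetch.return_value = return_value
        result = func_module.func()
        fake.fetch.assert_called_once_with()
    return result


first = run_with_mock("mock1")   # plays the role of test1
second = run_with_mock("mock2")  # plays the role of test2
restored = func_module.func()    # the patch is undone on exit
```

Because each patch replaces and then restores the attribute on func_module, every test gets its own fresh mock regardless of what was imported earlier.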

References:

https://stackoverflow.com/questions/5341147/how-to-patch-a-modules-internal-functions-with-mock

Pyppeteer fails to execute `setCookie()`

The cookie passed to setCookie() must specify the corresponding url.

References:

https://github.com/miyakogi/pyppeteer/issues/94

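Pyppeteer: visiting certain sites closes the websockets connection to chromium (error code 1006)

Cause:

The problem stems from chromium's websockets server. websockets>6 performs a ping-pong timeout check and closes the connection when the timeout is exceeded.

Solution (choose one):

In pyppeteer/connection.py, line 44, add ping_interval=None, ping_timeout=None to the websockets.client.connect() call (websockets>6).
Downgrade websockets to 6.0.

References: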

https://github.com/miyakogi/pyppeteer/issues/62
https://github.com/miyakogi/pyppeteer/pull/160
https://bugs.chromium.org/p/chromium/issues/detail?id=865002

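Pyppeteer runs synchronously inside scrapy and cannot take full advantage of its asynchronous nature

pyppeteer's concurrency is built on asyncio and the async/await syntax (introduced in Python 3.5), whereas scrapy's concurrency is based on the twisted framework. Some manual adjustments are needed to make scrapy compatible with asyncio.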
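
To show what is lost when the calls run one after another, here is a rough sketch of asyncio-style concurrency using only the standard library. `fake_page_load()` is a hypothetical stand-in for an awaitable pyppeteer call, and the URLs are placeholders; no real browser is involved.

```python
import asyncio


async def fake_page_load(url, delay=0.05):
    # Hypothetical stand-in for an awaitable browser call; it only sleeps.
    await asyncio.sleep(delay)
    return url


async def crawl(urls):
    # gather() schedules all coroutines at once and preserves input order.
    return await asyncio.gather(*(fake_page_load(u) for u in urls))


urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
pages = asyncio.run(crawl(urls))
```

With gather() the three waits overlap instead of adding up, which is the behaviour that disappears when each call is awaited to completion before the next begins.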

References:

https://github.com/lopuhin/scrapy-pyppeteer
https://www.lizenghai.com/archives/24943.html

When installing a Chinese application with winetricks, Chinese text still renders as boxes after installing fakechinese

Set LANG to zh_CN.UTF-8 before running winetricks:

env LANG=zh_CN.UTF-8 winetricks

References:

http://linux-wiki.cn/wiki/zh-hans/Wine%E7%9A%84%E4%B8%AD%E6%96%87%E6%98%BE%E7%A4%BA%E4%B8%8E%E5%AD%97%E4%BD%93%E8%AE%BE%E7%BD%AE
https://blog.csdn.net/mhlwsk/article/details/51919916

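Getting an element from PriorityQueue occasionally raises TypeError

Background:

The celery-dispatcher project uses a PriorityQueue to cache AsyncResult objects and assigns priorities based on time; the priority is adjusted on every polling round:

# Cache the AsyncResult
priority = time.time()
result = pickle.loads(result)
priority_queue.put((priority, result))

# Fetch an AsyncResult and adjust its priority
priority, result = priority_queue.get()
...
priority += poll_timeout
priority_queue.put((priority, result))

In practice the following exception was raised occasionally and could never be reproduced reliably:

TypeError: '<' not supported between instances of 'AsyncResult' and 'AsyncResult'

Cause:

PriorityQueue returns the lowest-valued entry first, that is, the one sorted(list(entries))[0] would return. When entries are (priority_number, data) tuples, the comparison checks priority_number first and falls back to data on a tie. If two entries have the same priority_number and data is not comparable, a TypeError is raised.

Because the priority above is adjusted on every polling round, two entries occasionally end up with the same priority; PriorityQueue then compares the result objects and raises the exception above.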
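
The tie can be forced deliberately, which makes the failure reproducible. In this sketch `Result` is a hypothetical stand-in for any class that defines no ordering, such as AsyncResult; the error surfaces as soon as the queue has to compare two entries with equal priority, which here already happens on the second put().

```python
import queue


class Result:
    """Hypothetical payload type that defines no ordering."""


pq = queue.PriorityQueue()
pq.put((1.0, Result()))

error = None
try:
    # Same priority as the first entry, so the tuple comparison
    # falls through to comparing the two Result objects.
    pq.put((1.0, Result()))
except TypeError as exc:
    error = str(exc)
```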

Solution:

For Python>=3.7, use the approach from the official documentation and wrap each PriorityQueue element in a PrioritizedItem.
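
A sketch of that wrapper, following the dataclass pattern shown in the queue documentation. The `Result` class is again a hypothetical stand-in for AsyncResult, and the priority is typed as float here on the assumption that it comes from time.time().

```python
import queue
from dataclasses import dataclass, field
from typing import Any


@dataclass(order=True)
class PrioritizedItem:
    priority: float
    # compare=False excludes the payload from ordering and equality.
    item: Any = field(compare=False)


class Result:
    """Hypothetical payload type that defines no ordering."""


pq = queue.PriorityQueue()
first, second = Result(), Result()
pq.put(PrioritizedItem(1.0, first))
pq.put(PrioritizedItem(1.0, second))  # equal priority, no TypeError

drained = [pq.get().item for _ in range(2)]
```

Only the priority field takes part in comparisons, so payloads never need to be orderable.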

For Python<=3.6, define PrioritizedItem yourself:

from functools import total_ordering

@total_ordering
class PrioritizedItem:
    def __init__(self, priority, item):
        self.priority = priority
        self.item = item

    def __lt__(self, other):
        return self.priority < other.priority

    def __eq__(self, other):
        return self.priority == other.priority

References:

https://stackoverflow.com/questions/54027861/using-queue-priorityqueue-not-caring-about-comparisons
https://docs.python.org/3.7/library/queue.html#queue.PriorityQueue

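Celery worker raises DoesNotExist after a task is dispatched from Django post_save

Background:

A Celery task is dispatched from a Django model's post_save signal with the object's ID so that the object can be processed asynchronously. When the Celery task tries to fetch the object, however, it raises DoesNotExist.

Solution:

Dispatch the Celery task as follows so that it runs only after the transaction has committed (until then the new row is not visible to the worker):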

from django.db import transaction
transaction.on_commit(lambda: handle_save_task.apply_async(args=[instance.pk]))

References:

https://stackoverflow.com/questions/45276828/handle-post-save-signal-in-celery

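This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

A record of pitfalls encountered during development and operations, in the hope that it helps anyone who needs it.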
