Articles with the python tag

Problems with triggering an Airflow DAG concurrently

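A small service of ours uses Airflow: time-consuming offline jobs are handed off to Airflow and triggered on demand (trigger_dag). Recently a small fraction of these jobs did not complete, and no log records could be found under the corresponding DAG, so it looked as if the trigger itself had failed. The airflow-web-server log contained this exception: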

Run id ... already exists for dag id ...


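When run_id is not specified, it defaults to utcnow, and the microseconds are stripped as well. With many concurrent triggers of the same DAG, run_ids can therefore easily collide, since any two triggers that fall within the same second get the same value.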
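The collision can be reproduced without Airflow. The following is a minimal sketch that only imitates the behavior described above; `default_run_id` and the `manual__` prefix are illustrative assumptions, not Airflow's actual code.

```python
from datetime import datetime, timezone


def default_run_id(now: datetime) -> str:
    """Hypothetical stand-in for the default: a UTC timestamp with microseconds dropped."""
    # The "manual__" prefix is an assumption made for illustration.
    return "manual__" + now.replace(microsecond=0).isoformat()


# Two triggers 800 ms apart, both inside the same wall-clock second.
first = datetime(2019, 1, 1, 12, 0, 0, 100000, tzinfo=timezone.utc)
second = datetime(2019, 1, 1, 12, 0, 0, 900000, tzinfo=timezone.utc)

# After truncation both triggers produce the same run_id.
print(default_run_id(first) == default_run_id(second))
```

Any two triggers whose timestamps differ only in the sub-second part end up with identical identifiers, which matches the "already exists" error above.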

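A temporary workaround is to pass an explicit run_id when triggering the DAG. For now I changed it to utcnow+uuid4, which should not collide again; if it does, I should go buy a lottery ticket...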
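One way to build such a run_id is sketched below. The helper name `make_run_id`, the prefix, and the exact string layout are assumptions for illustration; the post only states that utcnow and uuid4 are combined.

```python
import uuid
from datetime import datetime, timezone


def make_run_id(prefix: str = "manual") -> str:
    """Build a run_id from the current UTC time plus a random UUID4 (hypothetical helper)."""
    # The timestamp keeps the id human-readable and roughly sortable.
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    # The UUID4 suffix makes ids generated within the same second distinct.
    return f"{prefix}__{timestamp}_{uuid.uuid4().hex}"


# Generate many ids in a tight loop, far more than one per second.
run_ids = [make_run_id() for _ in range(1000)]
print(len(set(run_ids)) == len(run_ids))
```

The resulting string would then be supplied as the run_id each time the DAG is triggered, instead of relying on the default.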


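I inherited a pile of Nginx configuration with chaotic indentation and formatting, which was painful to work with. Remembering that golang has gofmt, I set out to build a simple ngxfmt.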

Continue ->

Salt minion ID recorded as an FQDN

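I recently started a salt minion on a machine exposed to the public network, but a colleague noticed that /etc/salt/minion_id was wrong. Previously, the auto-generated minion_id always matched the machine's /etc/hostname; this time it was a strange domain name, cncXXXX.XXX.ln.cn, identical to the output of hostname --all-fqdns. First, let's look at how minion_id is generated.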


When the minion is started, it will generate an id value, unless it has been generated on a previous run and cached (in /etc/salt/minion_id by default). This is the name by which the minion will attempt to authenticate to the master. The following steps are attempted, in order to try to find a value that is not localhost:

  1. The Python function socket.getfqdn() is run
  2. /etc/hostname is checked (non-Windows only)
  3. /etc/hosts (%WINDIR%\system32\drivers\etc\hosts on Windows hosts) is checked for hostnames that map to anything within 127.0.0.0/8

If none of the above are able to produce an id which is not localhost, then a sorted list of IP addresses on the minion (excluding any within 127.0.0.0/8) is inspected. The first publicly-routable IP address is used, if there is one. Otherwise, the first privately-routable IP address is used.

If all else fails, then localhost is used as a fallback.

It is most likely the first step: the ID came from socket.getfqdn, which is consistent with the observation above that it matches the output of hostname --all-fqdns.
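To check this on the affected machine, the two values can be compared directly from Python. This is only a diagnostic sketch: it calls the same standard-library function named in the documentation, but it does not reproduce Salt's full ID-generation logic.

```python
import socket

# Short host name as reported by the operating system.
hostname = socket.gethostname()

# Fully qualified name as resolved by the system resolver; this is the
# function the documentation lists as the first step.
fqdn = socket.getfqdn()

print("hostname:", hostname)
print("fqdn:    ", fqdn)

# If the two differ and fqdn is not a localhost name, the first step already
# yields a usable id and /etc/hostname would never be consulted.
print("fqdn differs from hostname:", fqdn != hostname)
```

On a machine whose address resolves to a provider-assigned domain, `fqdn` would be expected to show that domain rather than the short host name.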


Continue ->