Pythonista. Skier. Husband. Speaker. @PythonGlasgow organiser. @OpenStack hacker at @RedHatOpen

pynd - search Python code


For a while I have wanted a smarter grep[1]. A grep that understands Python syntax and idioms.

Over the last week or two I had a go at writing this and called it pynd (python find). As a very young project it is likely to change and evolve but I would love some feedback. Is this something you think you would use? What features would you like to see? Please send me your feedback.

I have spent lots of time grepping huge Python projects, and pynd is starting to make that easier for me.

What can it do?

pynd is best demonstrated with some simple examples from the docs, run against another project of mine.

Find and list all public functions

$ pynd --def --public
./retrace.py
181:def retry(*dargs, **dkwargs):
105:    def delay(self, attempt_number):
117:    def delay(self, attempt_number):
128:    def attempt(self, attempt):
140:    def attempt(self, attempt_number):
150:    def validate(self, result):
159:    def validate(self, result):
171:    def delay(self, attempt_number):
174:    def attempt(self, attempt_number):
177:    def validate(self, result):

Or, with the --private flag:

$ pynd --private --def
./retrace.py
49:    def _update_wrapper(wrapper, wrapped,
67:    def _wraps(wrapped, assigned=functools.WRAPPER_ASSIGNMENTS,
233:    def _setup_limit(self, limit):
248:    def _setup_interval(self, interval):
263:    def _setup_validator(self, validator):
276:    def _nice_name(self, thing):
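
To give a feel for what "understanding Python syntax" can mean, here is a minimal sketch of a public/private function search built on the standard library ast module. This is not pynd's actual implementation; the find_defs helper and the SOURCE sample are made up for illustration.

```python
import ast

# A made-up sample module to search.
SOURCE = '''
def retry(*dargs, **dkwargs):
    pass

class Retry(object):
    def _setup_limit(self, limit):
        pass

    def delay(self, attempt_number):
        pass
'''


def find_defs(source, public=True):
    """Yield (line number, name) for each function definition in source.

    A name counts as private when it starts with an underscore.
    """
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            is_private = node.name.startswith("_")
            if is_private != public:
                yield node.lineno, node.name


print(sorted(find_defs(SOURCE, public=True)))
print(sorted(find_defs(SOURCE, public=False)))
```

Because the search walks the syntax tree rather than the text, a "def" inside a string or comment is never reported.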

Look for each time a class instance was created

$ pynd --class Interval --call
./retrace.py
100:class Interval(_BaseAction):
251:            self._interval = Interval()
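
Call matching can be sketched the same way. The find_calls helper below is hypothetical and only approximates the idea: it looks at ast.Call nodes and matches the called name against a regular expression. It does not reproduce pynd's output format or its combination of --class and --call.

```python
import ast
import re

# A made-up sample module to search.
SOURCE = '''
class Interval(object):
    pass

def setup():
    interval = Interval()
    other = dict()
    return interval, other
'''


def find_calls(source, pattern):
    """Return sorted (line number, called name) pairs matching pattern."""
    regex = re.compile(pattern)
    matches = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Plain calls look like name(...); method calls like obj.name(...).
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            continue
        if regex.search(name):
            matches.append((node.lineno, name))
    return sorted(matches)


print(find_calls(SOURCE, "Interval"))
```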

Search only within docstrings

$ pynd --doc "decorator" --ignore-case
./retrace.py
181:def retry(*dargs, **dkwargs):
The retry decorator. Can be passed all the arguments that are accepted by
Retry.__init__.
210:class Retry(object):
The Retry decorator class.

This class handles the retry process, calling wither limiters or interval
objects which control the retry flow.
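
A docstring-only search is also easy to sketch with the standard library, since ast.get_docstring returns the docstring of a function or class node. The search_docstrings function and its sample input are assumptions for illustration, not code from pynd.

```python
import ast
import re

# A made-up sample module to search.
SOURCE = '''
def retry():
    """The retry decorator."""

def helper():
    """Nothing to see here."""

class Retry(object):
    """The Retry Decorator class."""
'''


def search_docstrings(source, pattern, ignore_case=False):
    """Return sorted (line number, name) for docstrings matching pattern."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    documented = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, documented):
            doc = ast.get_docstring(node)
            if doc and regex.search(doc):
                results.append((node.lineno, node.name))
    return sorted(results)


print(search_docstrings(SOURCE, "decorator", ignore_case=True))
```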

Everything else

Check out pynd --help

$ pynd --help
usage: pynd [-h] [--version] [--ignore-dir [IGNORE_DIR [IGNORE_DIR ...]]]
            [--ignore-case] [--files-with-matches] [--show-stats]
            [--public | --private] [--verbose | --debug] [-d] [-c] [-f] [-i]
            [-C] [-a]
            [PATTERN] [FILES OR DIRECTORIES [FILES OR DIRECTORIES ...]]

Search for PATTERN in each Python file in filesystem from the current
directory down. If any files or directories are specified then only those are
checked.

positional arguments:
  PATTERN               The pattern to match against. This must be a valid
                        Python regular expression.
  FILES OR DIRECTORIES  A file or directory to limit the search scope. This
                        can be provided multiple times.

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  --ignore-dir [IGNORE_DIR [IGNORE_DIR ...]]
                        A pattern to exclude directories. This must be a valid
                        Python regular expression. It can be provided multiple
                        times.
  --ignore-case         Make all the regular expression matching case
                        insesitive.
  --files-with-matches  Don't output all the results, just the paths to files
                        that contain a result.
  --show-stats          At the end, show some stats.
  --public              Only show results considered to be public in Python.
                        They don't start with an underscore.
  --private             Only show results considered to be private in Python.
                        They start with an underscore.
  --verbose             Explain what is happening.
  --debug               Output excessively to make debugging easier
  -d, --doc             Match class and function docstrings.
  -c, --class           Match class names.
  -f, --def             Match function names.
  -i, --import          Match imported package names.
  -C, --call            Match call statements.
  -a, --attr            Match attributes on objects

What next?

That was just a super quick tour of some of the features. Now I just need to settle in and use the project for a while and hopefully find a few others to do the same. Then we can see what to do next.

Get involved.

See the contributing docs and the roadmap or just join in on GitHub.


  1. I actually use ack, but grep is a better catch-all term. 

d0ugal
1042 days ago
Glasgow, Scotland

Create an Excellent Python Dev Env


There are a huge number of Python dev tools around, and a number of them are essential for my day to day development. However, they tend to suffer from a lack of discoverability and it takes a while to find what works for you.

I'm going to quickly share what I use. Some of these are well known, some are probably not. I'd expect most people to pick and choose from this post as you are unlikely to want everything I use, but there should be something useful for most people.

These are my primary goals:

  • Be able to install any Python version easily.
  • Don't ever touch the system Python.
  • An easy way to set up virtualenvs for specific projects.
  • Install and isolate a number of Python tools.

How do we get there?

pyenv

pyenv pitches itself as "simple python version management" and it does just that. Once set up, you can easily install and switch between Python versions, including specific point releases. pyenv install --list reveals it knows how to install a whopping 271 different Python versions at the moment, from CPython 2.1.3 up to 3.7-dev, as well as PyPy and Stackless.

The install process is a bit manual, but there is an install tool that makes it easier. After installing, I do something like this:

pyenv install -s 2.7.12;
pyenv install -s 3.5.2;
pyenv install -s 3.4.5;
pyenv install -s pypy-5.4.1;
pyenv global 2.7.12 3.5.2 3.4.5 pypy-5.4.1;

This installs the Python versions I typically need, and then sets them as the global default. The order is important: 2.7.12 becomes the default for python as it is first, and 3.5.2 becomes the default for python3.
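
The effect of the ordering can be sketched as "the first version in the list that provides the command wins". This is only a model of the behaviour described above, not pyenv's code, and the PROVIDES table is a simplified, hypothetical guess at which commands each install offers.

```python
# Hypothetical, simplified table of the commands each version provides.
PROVIDES = {
    "2.7.12": {"python", "python2", "python2.7"},
    "3.5.2": {"python", "python3", "python3.5"},
    "3.4.5": {"python", "python3", "python3.4"},
    "pypy-5.4.1": {"pypy"},
}

GLOBAL_VERSIONS = ["2.7.12", "3.5.2", "3.4.5", "pypy-5.4.1"]


def resolve(command, versions=GLOBAL_VERSIONS):
    """Return the first version in the list that provides command."""
    for version in versions:
        if command in PROVIDES.get(version, ()):
            return version
    return None


for command in ("python", "python3", "python3.4", "pypy"):
    print(command, "->", resolve(command))
```

Swapping the first two entries of the list would make python resolve to 3.5.2 instead.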

If you just want to use a specific Python version in a directory, and its subdirectories, you can run the command pyenv local 3.5.2 and it will create a .python-version file. Warning: if you do this in your home directory by mistake it can be very confusing.

One feature I'd love pyenv to have is a way to tell it to install a Python version (like 2.7 or 3.5) and have it automatically install the latest point release (and add a new command that removes and updates them when needed).
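
The reason a stray .python-version in your home directory is confusing is that the file applies to everything beneath it. A rough sketch of that lookup, walking up from a directory to the nearest .python-version, is below. The find_python_version function is hypothetical and simplified (it treats the file as holding a single version), and is not how pyenv itself is written.

```python
from pathlib import Path


def find_python_version(start):
    """Return the contents of the nearest .python-version file, or None.

    Checks the start directory first, then each parent in turn.
    """
    start = Path(start).resolve()
    for directory in [start, *start.parents]:
        candidate = directory / ".python-version"
        if candidate.is_file():
            return candidate.read_text().strip()
    return None


print(find_python_version("."))
```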

pyenv-virtualenv

For a long time I was a big user of virtualenvwrapper; however, my transition to pyenv and fish caused some issues. I stumbled on pyenv-virtualenv (not to be confused with pyenv-virtualenvwrapper, which also doesn't support fish), which covers all my needs. I wrote a few fish functions to make it a little easier to use. It isn't hard to use, just a little verbose.

For example, here is a handy way to make a temporary virtualenv. I found this feature of virtualenvwrapper (the mktmpenv command) particularly useful.

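function venv-tmp
  # Pick a random name so temporary environments never clash.
  set venv_tmp_name "tmp-"(random)
  # "python --version" prints "Python X.Y.Z"; keep only the version part.
  pyenv virtualenv (expr substr (python --version 2>&1) 8 20) $venv_tmp_name
  # venv-activate is another of my helper functions (not shown here).
  venv-activate $venv_tmp_name
end

function venv-tmp-cleanup
  # Remove every virtualenv whose name starts with "tmp-".
  # venv-rm is another of my helper functions (not shown here).
  for val in (pyenv versions | grep "/envs/tmp-")
    venv-rm (basename $val)
  end
end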

Generally it doesn't give me much over what virtualenvwrapper did (other than fish support) but I do like that it is managed by pyenv and integrates well.

pipsi

pipsi is a more recent addition to my setup. It is a fairly simple tool which allows you to install Python CLI tools in their own virtualenv and then the command is added to your path. The main advantage here is that they are all isolated and don't need to have compatible requirements. Uninstalling is also much cleaner and easier - you just delete the virtualenv.
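
The idea can be sketched in a few lines: one virtualenv per tool, plus a link to its script from a directory on your path. The code below is an illustration only, not pipsi's implementation. It assumes a POSIX layout (a bin directory inside the virtualenv), that the script has the same name as the package, and that links go in ~/.local/bin, all of which are my assumptions.

```python
import subprocess
import sys
from pathlib import Path

# Assumed locations for this sketch.
VENV_HOME = Path.home() / ".local" / "venvs"
BIN_DIR = Path.home() / ".local" / "bin"


def plan_install(package, venv_home=VENV_HOME, bin_dir=BIN_DIR):
    """Describe the steps needed to install package in isolation."""
    venv = Path(venv_home) / package
    return {
        "venv": venv,
        "create": [sys.executable, "-m", "venv", str(venv)],
        "install": [str(venv / "bin" / "pip"), "install", package],
        # (script inside the virtualenv, link placed on the path)
        "link": (venv / "bin" / package, Path(bin_dir) / package),
    }


def install(package):
    """Carry out the plan. Needs network access, so it is not run here."""
    plan = plan_install(package)
    subprocess.check_call(plan["create"])
    subprocess.check_call(plan["install"])
    source, target = plan["link"]
    target.parent.mkdir(parents=True, exist_ok=True)
    target.symlink_to(source)


print(plan_install("tox")["venv"])
```

With this layout, uninstalling is just removing the link and deleting the tool's virtualenv directory.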

I install a bunch of Python projects this way, here are some of the most useful.

  • tox: My de facto way of running tests.
  • mkdocs: A beautifully simple documentation tool (I might be biased).
  • git-review: The git review command for gerrit integration.
  • flake8: Python linting, mostly installed like this for use with vim.

Putting it all together

So, overall I don't actually use that many projects, but I am very happy with how it works. I have the setup automated, and it looks like this.

# pyenv
if [ ! -d ~/.pyenv ]; then
    curl -L https://raw.githubusercontent.com/yyuu/pyenv-installer/master/bin/pyenv-installer | bash
    git clone https://github.com/yyuu/pyenv-virtualenv.git ~/.pyenv/plugins/pyenv-virtualenv
else
    pyenv update
fi;

pyenv install -s 2.7.12;
pyenv install -s 3.5.2;
pyenv install -s 3.4.5;
pyenv install -s pypy-5.4.1;
pyenv global 2.7.12 3.5.2 3.4.5 pypy-5.4.1;
~/.pyenv/shims/pip install -U pip pipsi
rm -rf ~/.local/venvs
~/.pyenv/shims/pipsi install tox
~/.pyenv/shims/pipsi install mkdocs
~/.pyenv/shims/pipsi install git-review
~/.pyenv/shims/pipsi install 1pass
~/.pyenv/shims/pipsi install flake8
~/.pyenv/shims/pipsi install yaql
~/.pyenv/shims/pipsi install livereload

The summary is: first install pyenv and set up the Python versions you need. Then install pipsi into the default pyenv environment and use that to install the other tools. The system Python should never be touched.

A couple of things are missing as you'll need to set up paths and so on, so please do look at the install guides for each.

d0ugal
1094 days ago
Glasgow, Scotland

Debugging Mistral in TripleO


During the OpenStack Newton cycle the TripleO project has started to use Mistral, the OpenStack Workflow service. This has allowed us to provide an interface to TripleO that is used by the CLI and the new GUI (and potentially other consumers in the future).

We naturally had to debug a few problems along the way. We'll go through the steps to track down the issue.

Mistral Primer

In TripleO, we call Mistral in two different ways: either by starting Workflows or by directly calling Actions. Workflows are started by creating Executions, which then represent the running workflow. Running actions are represented by Action Executions. Since Workflows are typically made up of a number of action calls (or sub-workflow calls), starting a Workflow will start one or more Executions and one or more Action Executions.

Unfortunately it isn't always clear if you are calling a workflow or action directly. So, first things first, what is happening in Mistral? Let's list the workflow executions and the action executions.

$ openstack workflow execution list
$ openstack action execution list
# or
$ mistral execution-list
$ mistral action-execution-list

These commands can be generally useful if you are waiting for something to finish and want to look for signs of progress.

The most important columns to pay attention to are the "Workflow name" and "State" in both. Then the "State info" in the execution list and the "Task name" in the action execution list.

Finding the error

Okay, so something has gone wrong. Maybe you have an error that mentions Mistral or you noticed an error state in either the execution list or action execution list.

First check the executions. If you have a Workflow in error state, then you often want to look at the action executions, unless there is an error in the workflow itself. The output here should give us enough information to tell whether the problem is in the workflow or in one of the actions.

mistral execution-list | grep "ERROR";
# Grab the execution ID from above.
mistral execution-get $EXECUTION_ID;
mistral execution-get-output $EXECUTION_ID;
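
If you do this often, the "grep for ERROR, then grab the ID" step can be scripted. The sketch below assumes the CLI prints an ASCII table with "ID" and "State" columns; the SAMPLE text is made up for illustration and is not real Mistral output, and both helper functions are hypothetical.

```python
# Made-up table in the assumed layout; real output has more columns.
SAMPLE = """
+--------+---------------+---------+
| ID     | Workflow name | State   |
+--------+---------------+---------+
| aaa111 | example.one   | SUCCESS |
| bbb222 | example.two   | ERROR   |
+--------+---------------+---------+
"""


def parse_table(text):
    """Turn an ASCII table into a list of {column: value} dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def failed_executions(text):
    """Return the IDs of all rows whose State column is ERROR."""
    return [row["ID"] for row in parse_table(text) if row.get("State") == "ERROR"]


print(failed_executions(SAMPLE))
```

Matching on the State column, rather than grepping the whole line, avoids false hits when "ERROR" appears in another column.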

Then check the actions. Often these are more useful to look at, but you first want to know which workflow execution you are debugging.

# Also look at the actions
mistral action-execution-list;
mistral action-execution-get-output $ACTION_ID;

NOTE: Sometimes an action execution is in the ERROR state, but that is expected. For example, in some workflows we check if a swift container exists and it is an "ERROR" if it doesn't, but it just changes the Workflow logic.

Hopefully this gives you some idea what is going on, but you may need to look into the logs for the full traceback...

Logs

The Mistral log is very detailed and useful for in-depth debugging. To follow it and look for messages from the TripleO actions, or ERRORs, I find this very useful.

tail -f /var/log/mistral/mistral-server.log | grep "ERROR\|tripleo_common";
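
The same filter is easy to express in Python if you want to post-process a saved log. This is a small sketch of the grep above; the log lines in the usage example are invented, not real Mistral output.

```python
import re

# Same alternation as the grep pattern above.
PATTERN = re.compile(r"ERROR|tripleo_common")


def interesting(lines):
    """Yield only the lines mentioning ERROR or tripleo_common."""
    for line in lines:
        if PATTERN.search(line):
            yield line


# Invented sample lines for illustration.
sample = [
    "INFO mistral.engine starting task",
    "ERROR mistral.engine something failed",
    "DEBUG tripleo_common.actions running",
]
for line in interesting(sample):
    print(line)
```

Since it accepts any iterable of lines, it works on an open file object as well as a list.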

Common-ish Problems

A couple of problems I've seen a few times and how they can be spotted.

  • "Error response from Zaqar. Code: 503. Title: Service temporarily unavailable. Description: Messages could not be enqueued. Please try again in a few seconds.."

Sometimes workflows will fail when sending messages to Zaqar, which is how the result of a workflow is reported. Unfortunately this is hard to debug. You can usually safely retry the full workflow, or retry the individual task.

mistral task-list;
# Find the ID for the failed task.
mistral task-rerun $ID;

Hopefully we can resolve this issue: https://bugs.launchpad.net/tripleo/+bug/1626103

  • Another problem? I shall add to this as they come up!

d0ugal
1147 days ago
Glasgow, Scotland

Retrace - Configurable, elegant retrying


After I mentioned Retrace here recently a few people have asked me about it, so I thought I'd write up a quick post.

Retrace is, essentially, a retry decorator. There are many projects like this but after trying a few I either ran into issues or found the API cumbersome. I wanted something with a really simple interface but with the ability to easily build on it.

Simplicity

By default, it will retry the function on any exception which subclasses Exception. Any exceptions that directly inherit from BaseException (like KeyboardInterrupt) won't be caught by default, as you generally don't want that.

import retrace

@retrace.retry
def unstable():
    pass  # ...

Of course, you can change what you catch by passing on_exception which can be any valid exception class.

import retrace

@retrace.retry(on_exception=IOError)
def unstable():
    pass  # ...
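
To make the behaviour concrete, here is a minimal retry decorator written from scratch. It is not Retrace's implementation; the simple_retry name and its default limit of 3 are my own assumptions for the sketch. It shows why catching Exception leaves KeyboardInterrupt alone, and how on_exception narrows what is retried.

```python
import functools


def simple_retry(func=None, on_exception=Exception, limit=3):
    """Retry the wrapped function when on_exception is raised."""
    def decorator(wrapped):
        @functools.wraps(wrapped)
        def wrapper(*args, **kwargs):
            for attempt in range(1, limit + 1):
                try:
                    return wrapped(*args, **kwargs)
                except on_exception:
                    if attempt == limit:
                        raise  # out of attempts, let the error propagate
        return wrapper

    # Support both @simple_retry and @simple_retry(...).
    if func is not None:
        return decorator(func)
    return decorator


attempts = []


@simple_retry(on_exception=IOError, limit=5)
def unstable():
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError("not ready yet")
    return "ok"


print(unstable(), "after", len(attempts), "attempts")
# KeyboardInterrupt derives from BaseException, not Exception,
# so "except Exception" never swallows a Ctrl-C.
print(issubclass(KeyboardInterrupt, Exception))
```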

Portability

Retrace is tested and supported on Python 2.7, 3.3, 3.4 and 3.5. It is also designed to be easily vendorable. I understand that you might not want, or be able, to include a dependency for such a small utility, so you can just grab the retrace.py file and include it in your project.

Customisation

Retrace supports limiters, intervals and validators. These are all fairly similar concepts, but play a different role. We will quickly take a look at each of these.

Limiters

A limiter defines how many times the function should be retried. This can be passed in as either an int or a callable.

For example, retry a maximum of 10 times.

import retrace

@retrace.retry(limit=10)
def unstable():
    pass  # ...

If a callable is passed in, the number of retries can be limited easily with any custom logic.

import random
import retrace

def random_limit(attempt):
    if attempt > random.randint(0, 10):
        raise retrace.LimitReached()

@retrace.retry(limit=random_limit)
def unstable():
    pass  # ...

Intervals

Intervals define the delay that is introduced between attempts. This can either be passed in as an int (which will then be the number of seconds) or as a callable.

Delay for 1 second between attempts.

import retrace

@retrace.retry(interval=1)
def unstable():
    pass  # ...

Delay for n seconds, where n is the current number of attempts. This then gradually increases the delay by one second each try.

This works because time.sleep is a callable and we pass in the current attempt number each time.

import time

import retrace

@retrace.retry(interval=time.sleep)
def unstable():
    pass  # ...
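
One way to picture the "int or callable" choice is a small helper that turns either form into a function taking the attempt number. This make_delay helper is hypothetical and only illustrates the idea; it is not taken from Retrace.

```python
import time


def make_delay(interval):
    """Return a delay(attempt_number) function for an int or a callable."""
    if callable(interval):
        # A callable receives the attempt number directly, which is why
        # time.sleep works: attempt 3 simply sleeps for 3 seconds.
        return interval

    def delay(attempt_number):
        # A plain number means the same pause before every attempt.
        time.sleep(interval)

    return delay


recorded = []
delay = make_delay(recorded.append)  # stand-in callable that just records
delay(1)
delay(2)
print(recorded)
```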

Validators

Validators are used to verify that the result from the function passes a check.

If it isn't a callable, it can be any object that is then compared with the result. Check that the function returns the value "EXPECTED".

import retrace

@retrace.retry(validator="EXPECTED")
def unstable():
    pass  # ...

If it is a callable, it will be passed the result and it should return True if it has passed, or False if it has failed and the function should be called again.

import retrace

def validate_string(value):
    return isinstance(value, str)

@retrace.retry(validator=validate_string)
def unstable():
    pass  # ...
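
Put together, validation amounts to "call again until the result passes". The loop below is a from-scratch sketch of that idea, covering both the plain value and the callable form. The call_until_valid function and its limit are assumptions of mine, not part of Retrace.

```python
def call_until_valid(func, validator, limit=5):
    """Call func until its result passes the validator."""
    for _ in range(limit):
        result = func()
        if callable(validator):
            valid = bool(validator(result))
        else:
            valid = result == validator
        if valid:
            return result
    raise RuntimeError("no valid result after %d attempts" % limit)


# A stand-in for an unstable function: it returns a new value each call.
values = iter([None, 1, "EXPECTED"])
print(call_until_valid(lambda: next(values), "EXPECTED"))
```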

It's a small project, but I find it useful fairly frequently. If this is something that interests you I would really like your feedback. How could it be made better? What do you need that I have not thought of? Please send me your ideas!

d0ugal
1147 days ago
Glasgow, Scotland

Automate publishing to PyPI with pbr and Travis


Releasing new versions of your Open Source projects is a time-consuming task. So in a recent, and admittedly small, project I decided to try and make it as easy as possible.

I started using pbr and Travis to automatically deploy to PyPI each time I push a git tag. This means creating a new release is as simple as git tag 1.0.0 and git push origin --tags. Anyone with commit access can then easily and confidently roll a release.

Let's break that down a bit.

pbr

Python Build Reasonableness is a library for making packaging easier and more consistent. If you are happy to use its conventions, which are mostly reasonable, it works very well. The project is developed under the OpenStack umbrella and used on most[1] OpenStack projects. That is how I was first introduced to it, but it works well outside of this ecosystem.

To use pbr, you need to add it to your setup.py. The project encourages using a setup.cfg, so your full setup.py should look like this.

#!/usr/bin/env python

from setuptools import setup

setup(
    setup_requires=['pbr>=1.9', 'setuptools>=17.1'],
    pbr=True,
)

For retrace I had to make a few extra additions because it is packaging up a single file, rather than a directory. After you have done this, many of the familiar details from a setup.py are added to a setup.cfg instead. This is most of the retrace setup.cfg.

[metadata]
name = retrace
author = Dougal Matthews
author-email = dougal@dougalmatthews.com
summary = Configurable retrying.
description-file = README.rst
home-page = https://github.com/d0ugal/retrace
classifier =
    Intended Audience :: Developers
    Natural Language :: English
    Programming Language :: Python
    Programming Language :: Python :: 2.7
    Programming Language :: Python :: 3.4
    Programming Language :: Python :: 3.5

[files]
packages =
    retrace

[wheel]
universal = 1

One of the really useful features in pbr is that it automatically versions your projects. It uses git tags for this, so if you tag something 1.0.0, then the version published to PyPI etc. will be 1.0.0. Then if you do some commits (let's say 5) and pip install then you will install 1.0.0.dev5. Tagging a 1.0.1 then versions your bug fix release and resets the dev counter.

It's really simple and removes the need to manually bump versions, which is one of the manual steps that can easily be messed up.
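
The two inputs to this kind of versioning, the latest tag and the number of commits since it, are what git describe --tags reports (for example 1.0.0-5-g1a2b3c4, where the last part is an abbreviated commit hash). The sketch below only parses that output. It is not pbr's code, the parse_describe helper is hypothetical, and how pbr combines the pieces into a final version string is left to pbr.

```python
import re

# Format printed by "git describe --tags" when HEAD is past the tag:
# <tag>-<commits since tag>-g<abbreviated sha>
DESCRIBE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(text):
    """Return (tag, commits since tag, sha or None) from describe output."""
    text = text.strip()
    match = DESCRIBE.match(text)
    if match is None:
        # Exactly on a tag: git describe prints just the tag name.
        return text, 0, None
    return match.group("tag"), int(match.group("count")), match.group("sha")


print(parse_describe("1.0.0-5-g1a2b3c4"))
print(parse_describe("1.0.1"))
```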

Travis

Hopefully most people know Travis. It is CI-aaS, Continuous Integration as a Service. They have built a really great product which they make free for Open Source projects.

The most common Travis use case is probably for running tests and code linting, which are both fantastic reasons, but there are compelling reasons to have Travis automatically do deploys. Running the deploy on different local machines increases the chances of picking up something that wasn't committed to git. If you do this on a CI server, the deploy should always be done from a clean and consistent environment.

This is most of the retrace .travis.yml config file, trimmed down a little to keep it short, but it should be functional. The deploy block is the most interesting section.

language: python
python: '3.5'
env:
- TOXENV=py35
- TOXENV=flake8
- TOXENV=docs
install:
- pip install tox
script:
- tox
deploy:
  provider: pypi
  user: d0ugal-deploy
  distributions: sdist bdist_wheel
  password:
    secure: b4f6y1xw5B/RXXnOu6JIaNcgOBZ0/CkNaMeEXsoQSewYZNwobLPYALY9WaaOblarwrVa5NRD3e4x6SoL1/1NzQxfhCNMn7L82sssmtevnK+mSuUp4IZQa8WKyz+xLfnk28TlHgQbctAU9NaeQ6GuEflTRD7Bp8+xJ1C7h+yBUnw=
  on:
    tags: true
    repo: d0ugal/retrace
    condition: "$TOXENV = py35"

First we specify the provider as PyPI, then the PyPI user is set[2]. We specify both source and wheel distributions. The password is encrypted and added with travis encrypt --add deploy.password. Finally some conditions are set: we only want to deploy for tags, from the source repository, and for one specific TOXENV (otherwise a deploy would be attempted for each env).
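
The three conditions in the on block amount to a simple check, which can be written out as a function for clarity. This is only a model of the logic, not something Travis runs. TRAVIS_TAG and TRAVIS_REPO_SLUG are environment variables Travis sets during a build; the should_deploy helper is hypothetical.

```python
def should_deploy(env):
    """Mirror the on: block: a tag, the right repo, and the py35 env."""
    return (
        bool(env.get("TRAVIS_TAG"))
        and env.get("TRAVIS_REPO_SLUG") == "d0ugal/retrace"
        and env.get("TOXENV") == "py35"
    )


build = {
    "TRAVIS_TAG": "1.0.0",
    "TRAVIS_REPO_SLUG": "d0ugal/retrace",
    "TOXENV": "py35",
}
print(should_deploy(build))
print(should_deploy(dict(build, TOXENV="flake8")))
```

Without the TOXENV condition, each of the three jobs in the env matrix would try to upload the same release.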

Travis will then use twine to upload to PyPI, which uses HTTPS, even on older Python versions.

Updating documentation

As a bonus, this is a nice extra touch with Travis. You can easily have your documentation automatically built and published. In retrace the documentation is a small MkDocs project. With the following in your Travis config you can easily publish your documentation on every commit.

after_success:
  - git config user.name "Dougal Matthews";
  - git config user.email "dougal@dougalmatthews.com";
  - git remote add gh-token "https://${GH_TOKEN}@github.com/d0ugal/retrace.git";
  - git fetch gh-token && git fetch gh-token gh-pages:gh-pages;
  - pip install -IU -r test-requirements.txt .;
  - mkdocs gh-deploy -v --clean --remote-name gh-token;

You may want to move this to only happen on a deploy, but for a small project like retrace there is typically a release after any changes that would require docs updates.


Overall, I am very happy about how this came out. I would like to roll out a similar strategy to other projects that I maintain. The change isn't earth shatteringly huge, but it certainly greases the wheels a bit and makes things easier.


  1. And quite possibly every OpenStack project, but there are quite a few, so I don't plan on checking! 

  2. I have a dedicated user for automated deploys because I am paranoid about including my encrypted password. 

d0ugal
1147 days ago
Glasgow, Scotland