Magic Method, on the wall, who, now, is the __fairest__ one of all?


Published on May 19, 2017 by Sep Dehpour

Update May 22: Here is the video of the talk:

The Proposal 

This was my talk proposal for PyCon 2017, which got accepted. The proposal is slightly modified to match the final talk better. For example, originally I was going to talk about writing a Redis client too, but I ended up removing that from the final talk. I will be giving this talk tomorrow, Saturday May 20th!

The code samples used in this talk can be found at: https://github.com/seperman/bad-ideas

Or simply: pip install bad-ideas

This talk is lightly inspired by my other talk: Harness the power of Python magic methods and lazy objects.

Summary 

Magic methods are a very powerful feature of Python and can open a whole new door for you. However, with great power comes great responsibility.

In this talk we explore magic methods’ capabilities by first experimenting with recreating the echo, grep, and pipe bash command syntaxes as valid Python syntax. And finally we learn about reference counting and the garbage collector by creating undeletable objects.

Once you see what magic methods can bring to the table; the limit is only your imagination!

Who and Why 

This talk is mainly geared towards novice Python developers and is about Python’s magic methods. However, the ideas brought up in the experiments can be very interesting for more experienced Python developers as well. The audience is expected to have limited or even no exposure to the magic methods.

The talk is expected to make the audience excited about what magic methods can bring to the table and demystify certain syntaxes that they might have seen and used in certain libraries, for example Django or SQLAlchemy queries built by chaining filters and operators.

Magic Method, on the wall, who, now, is the __fairest__ one of all? 

Intro 

1.5 min

What are Python’s magic methods?

  • Special methods that you can define to add “magic” to your classes.
  • They all look like: __something__.

Example: __init__
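
As a quick illustration, here is a minimal sketch of a class with two magic methods. The Greeting class and its name attribute are made up for this example; the point is only that Python calls these methods for you.

```python
class Greeting:
    def __init__(self, name):
        # Called automatically by Greeting("...") to set up the new object.
        self.name = name

    def __len__(self):
        # Called automatically by len(greeting).
        return len(self.name)


greeting = Greeting("PyCon")
print(greeting.name)   # PyCon
print(len(greeting))   # 5
```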

So what can we do with them?

Let’s do some experiments and learn by examples!

Disclaimer 

Magic methods are a very powerful feature of Python and can open a whole new door for you. However, with great power comes great responsibility.

The following experiments are solely for educational purposes and NOT for production code.

Experiment: Type Less 

4.5 min

I don’t know about you but I don’t like typing too much. Every keystroke is a stress on your fingers and over time it adds up.

There are times that you need to type something like:

a = 10
a = a + 20

or as you know the shortcut that is:

a += 20

Since we are all about typing less, what if we just write a + 20 and that does the job for us? We save one keystroke: the =.

>>> a = 10
>>> a + 20
>>> print(a)
30

Let’s look at the full list of magic methods.

__add__ and __sub__ 

We can do that using __add__ and __sub__ magic methods.

class Num:

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        self.value += other
        return self.value

    def __sub__(self, other):
        self.value -= other
        return self.value

    def __repr__(self):
        return str(self.value)

    __str__ = __repr__

>>> a = Num(10)
>>> a
10
>>> a + 20
30
>>> a
30
>>> a - 5
25
>>> a
25

Yay, we removed the need to type = which saves a couple million keystrokes a year.

What do you think is gonna happen if we do:

20 + a

Oops! We get:

TypeError: unsupported operand type(s) for +: 'int' and 'Num'

Let’s look at __add__ again:

    def __add__(self, other):
        self.value += other
        return self.value

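We are adding “other” to the current value. When we run a + 20, Python calls Num.__add__(a, 20). However, when we do 20 + a, it calls int.__add__(20, a) first. int has no idea how to add a Num, so it returns NotImplemented, and since Num has no reversed add to fall back on, Python freaks out with the TypeError!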
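
To see this dispatch for yourself, here is a small sketch. OneWay is a hypothetical stand-in for the Num class above that only defines __add__; everything else is plain Python.

```python
class OneWay:
    """Hypothetical stand-in for Num with only __add__ defined."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        self.value += other
        return self.value


a = OneWay(10)
print(a + 20)               # 30, handled by OneWay.__add__(a, 20)
print(int.__add__(20, a))   # NotImplemented: int can't add a OneWay

try:
    20 + a
except TypeError as error:
    # OneWay has no __radd__, so Python gives up with a TypeError.
    message = str(error)
    print(message)
```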

__radd__ and __rsub__ 

That’s where the reversed add and sub come into play.

What happens is that when int.__add__(20, a) returns NotImplemented, Python tries the reversed add on the right-hand operand instead, which is Num.__radd__(a, 20).

    def __rsub__(self, other):
        self.value = other - self.value
        return self.value

    __radd__ = __add__

Here is the full implementation:

class Num:

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        self.value += other
        return self.value

    def __sub__(self, other):
        self.value -= other
        return self.value

    def __rsub__(self, other):
        self.value = other - self.value
        return self.value

    def __repr__(self):
        return str(self.value)

    __str__ = __repr__
    __radd__ = __add__

>>> a = Num(10)
>>> a + 20
30
>>> a
30

>>> a - 5
25
>>> a
25

>>> 40 + a
65
>>> a
65

>>> 20 - a
-45
>>> a
-45
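
Why can __radd__ simply reuse __add__ while __rsub__ needs its own body? Addition does not care about operand order, but subtraction does. Here is a minimal sketch with a hypothetical Tracer class (not part of the talk’s code) that records which method Python picked and what it received:

```python
class Tracer:
    """Hypothetical helper: records which magic method was called."""

    def __init__(self):
        self.calls = []

    def __sub__(self, other):
        self.calls.append(("__sub__", other))
        return self

    def __rsub__(self, other):
        # Called for `other - self`, so `other` is the LEFT operand here.
        self.calls.append(("__rsub__", other))
        return self


t = Tracer()
t - 5    # Tracer is on the left: __sub__
20 - t   # Tracer is on the right: int gives up, then __rsub__
print(t.calls)  # [('__sub__', 5), ('__rsub__', 20)]
```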

Disclaimer: That’s a bad idea. Don’t try this at home. I mean at work.

Experiment: Print Filter 

2 min

Filter is a built-in function that does filtering on iterables.

Here is an example in Python 3:

foo = [1, 2, 3, 5, 6, 7]
bar = filter(lambda x: x % 3 == 0, foo)

We are filtering the list to have only elements that are divisible by 3.

If you print bar, what do you think you are gonna get?

>>> print(bar)
<filter object at 0x119151d18>

That’s right, in Python 3 bar is a lazy iterator (a filter object), not a list.

How do you get the filtered list printed?

One way is to convert the iterator to a list:

>>> print(list(bar))
[3, 6]
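
Keep in mind that a filter object can only be walked through once. A short sketch of what that looks like, reusing foo and bar from above:

```python
foo = [1, 2, 3, 5, 6, 7]
bar = filter(lambda x: x % 3 == 0, foo)

first = list(bar)    # consumes the iterator
second = list(bar)   # nothing left to yield

print(first)   # [3, 6]
print(second)  # []
```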

But sometimes that is too much work if you want to keep printing the object and you know you don’t want it as an iterator when printing.

How can we modify the built-in filter so it converts itself into a list when printed?

Everything is an object 

Did you know that everything is an object in Python?

def func(x):
    previous_x = getattr(func, "_x", "Not set")
    print("new value: {}, previous value: {}".format(x, previous_x))
    func._x = x

>>> func(10)
new value: 10, previous value: Not set
>>> func(20)
new value: 20, previous value: 10
>>> func(30)
new value: 30, previous value: 20

Even built-in functions are objects! And filter is more than that: in Python 3 it is actually a class, so we can subclass it. Let’s subclass the built-in filter and add __str__ and __repr__ to it:

class Filter(filter):
    def __str__(self):
        return str(list(self))

    __repr__ = __str__


bar = Filter(lambda x: x % 3 == 0, foo)
print(bar)  # prints [3, 6]

Now you can use Filter instead of filter and printing will give you the filtered results. No worries!
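
One catch: since __str__ runs list(self), printing a Filter also uses up the underlying iterator, so a second print would show []. Here is a minimal sketch of one way around that. CachingFilter and its _results helper are hypothetical names, not part of the bad-ideas package:

```python
class CachingFilter(filter):
    """Hypothetical variant of Filter that remembers its results."""

    def _results(self):
        # Consume the underlying iterator once and keep the list around.
        if not hasattr(self, "_cache"):
            self._cache = list(self)
        return self._cache

    def __str__(self):
        return str(self._results())

    __repr__ = __str__


foo = [1, 2, 3, 5, 6, 7]
bar = CachingFilter(lambda x: x % 3 == 0, foo)
print(bar)  # [3, 6]
print(bar)  # [3, 6] again, served from the cache
```

Iterating over bar directly after printing still yields nothing; only the printed form is cached in this sketch.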

Experiment: Echo to file 

1.5 min

Here is one of my favorites in bash:

echo "hello" >> foo.txt

And sometimes that syntax is too good not to use in Python.

The trick is that Python 2 used to have something like this:

print >> myfile, "Hello World!\n"
print >> myfile, "I want a burrito."

But you can’t do that in Python 3 anymore since print is a function now.

Hmm, what have we got for the >> operator?

operator    method
>>          __rshift__

Awesome! Let’s get to work.

myfile = open("hello.txt", "w")

class Echo:
    def __init__(self, text):
        self.text = text

    def __rshift__(self, other):
        other.seek(0, 2)
        other.write(self.text)
# Writes to the end of the file!
>>> Echo("Hello World!\n") >> myfile
>>> Echo("I want a burrito.") >> myfile
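
If you want to try this without touching the disk, here is a self-contained sketch that swaps the real file for an in-memory io.StringIO (my substitution, not what the talk uses) and reads the result back:

```python
import io


class Echo:
    def __init__(self, text):
        self.text = text

    def __rshift__(self, other):
        other.seek(0, 2)        # jump to the end, like >> in bash
        other.write(self.text)


fake_file = io.StringIO()       # in-memory stand-in for a real file
Echo("Hello World!\n") >> fake_file
Echo("I want a burrito.") >> fake_file
print(fake_file.getvalue())
```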

Experiment: Pipe and Grep 

3.5 min

Here is another favorite from bash: pipe and grep. I use it all the time.

command | grep something

First of all, what can we use for the pipe | operator?

Let’s see.

operator    method
|           __or__

aha!

Let’s say we define a grep that uses a | (binary or) operator. This is one way we can define it:

class Grep:
    def __or__(self, other):
        ...

Note the self and other in the arguments. It means that in order to use | with this grep, Python first needs to create the Grep object (run its __init__) and then do the pipe with that object on the left. So the order in which we write things is gonna be different from what we are used to seeing in bash:

Instead of text | grep something, it is gonna be grep(something) | text. But wait a second. There is a reversed or too: __ror__. That lets us write text | grep(something)!
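
A tiny sketch of the difference between the two. Show is a hypothetical helper that just reports which side of the | it ended up on:

```python
class Show:
    """Hypothetical helper that reports which side of | it was on."""

    def __or__(self, other):
        return "__or__: Show is on the left of {!r}".format(other)

    def __ror__(self, other):
        return "__ror__: Show is on the right of {!r}".format(other)


print(Show() | "text")   # __or__: Show is on the left of 'text'
print("text" | Show())   # __ror__: Show is on the right of 'text'
```

The second line works because str does not define | at all, so Python falls back to the right-hand operand’s __ror__.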

class Grep:

    def __init__(self, item):
        self.item = item.lower()

    def thefilter(self, line):
        return self.item in line

    def __ror__(self, other):
        if isinstance(other, str):
            other = other.lower().split('\n')
        return Filter(self.thefilter, other)


lines = """
Whether you're new to programming or
an experienced developer, it's easy
to learn and use Python.
Checkout jobs.python.org
for Python jobs.
"""

>>> lines | Grep('Python')
['to learn and use python.',
'checkout jobs.python.org',
'for python jobs.']

Awesome! You can even chain the greps!

>>> found = lines | Grep('Python') | Grep('jobs')
>>> print(found)
['checkout jobs.python.org', 'for python jobs.']
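
Notice that the results come back lowercased, because Grep lowercases the whole text before splitting it. If you would rather keep the original lines, one possible variation is sketched below. KeepCaseGrep is a hypothetical name, and it returns a plain list instead of a Filter to stay self-contained:

```python
class KeepCaseGrep:
    """Hypothetical Grep variant: case-insensitive match, original lines kept."""

    def __init__(self, item):
        self.item = item.lower()

    def __ror__(self, other):
        if isinstance(other, str):
            other = other.split('\n')
        # Compare in lowercase, but return the untouched line.
        return [line for line in other if self.item in line.lower()]


lines = """
Whether you're new to programming or
an experienced developer, it's easy
to learn and use Python.
Checkout jobs.python.org
for Python jobs.
"""

found = lines | KeepCaseGrep('Python') | KeepCaseGrep('jobs')
print(found)  # ['Checkout jobs.python.org', 'for Python jobs.']
```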

Experiment: The undeletable 

5.5 min

Let’s say you run del obj. Normally that would delete the object but we want to make it undeletable!

>>> del obj
<obj: I'm still here. You CAN NOT delete me!>
>>> obj
<Yes I'm still here!>

Do you know what happens when you delete an object?

del obj

Garbage Collector 

Hmm, ok let’s review how deleting works in Python.

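CPython specifically keeps track of the number of references to each object. This is called reference counting. When you do del obj, it removes the name obj and decrements the object’s reference count by one. If that was the last reference, the count drops to zero.

Then the object gets deallocated right away; the cyclic garbage collector only has to step in for objects that are stuck in reference cycles. However, if your object has a finalizer, then things can get tricky.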
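
You can watch the reference count move with sys.getrefcount. This is a minimal sketch that assumes CPython; Thing and watcher are names made up for the example, and the weak reference is only there so we can check whether the object is still alive without keeping it alive ourselves.

```python
import sys
import weakref


class Thing:
    pass


obj = Thing()
watcher = weakref.ref(obj)      # a weak reference does not keep obj alive

before = sys.getrefcount(obj)
alias = obj                     # one more reference
after = sys.getrefcount(obj)
print(after - before)           # 1

del alias                       # count goes down by one, object survives
print(watcher() is obj)         # True

del obj                         # last reference gone
print(watcher() is None)        # True on CPython
```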

You might ask: what is a finalizer? Objects with finalizers are objects with a __del__ method and generators with a finally block.

And if your object has a finalizer, the finalizer is run at that moment, right before the object is deallocated.

The important thing to keep in mind is that it is not guaranteed that __del__ will run immediately after you run del, since it only runs once the last reference to the object is gone. The __del__ might never run. So you can’t ever depend on it. But for the sake of this experiment, we will use __del__ to our advantage.
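
Here is a small sketch of del not triggering __del__ because another reference is still around. It assumes CPython; Noisy and events are hypothetical names.

```python
events = []


class Noisy:
    def __del__(self):
        events.append("__del__ ran")


a = Noisy()
b = a           # second reference to the same object

del a           # only removes the name a
print(events)   # []

del b           # last reference gone, CPython finalizes right away
print(events)   # ['__del__ ran']
```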

Again, when __del__ is run, the reference count of the object has already dropped to zero and the name has literally been removed from the namespace it existed in before.

So how can we resurrect the object once the __del__ is running? Maybe we can raise some exception in __del__ so it can’t be successfully run and the GC aborts deleting it?

The answer is no. CPython ignores the exception (it only prints a warning to stderr) and still deallocates the object.
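
A quick way to convince yourself, sketched under the assumption that you are on CPython. Grumpy and watcher are hypothetical names:

```python
import weakref


class Grumpy:
    def __del__(self):
        raise RuntimeError("I refuse to go!")


g = Grumpy()
watcher = weakref.ref(g)   # lets us check whether the object is still alive

del g                      # CPython reports the ignored exception on stderr
print(watcher() is None)   # True: the object is gone anyway
```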

Here is another idea: we still have access to self inside __del__. After all it is

def __del__(self):
   ...

How can we use this to our advantage? Maybe we put the object back in the namespace it was deleted from?

class Obj:

    def __del__(self):
        global obj
        obj = self
        print("You can't delete me!")

    def __str__(self):
        return "<obj:{}>".format(id(self))


>>> obj = Obj()

>>> print(obj)
<obj:123123>
>>> del obj
You can't delete me!
>>> print(obj)
<obj:123123>

Del didn’t delete the object! It is the same object with the same id!

So what happened here again?

  1. del obj removes obj from the globals namespace in this case. Since that was the only reference, the reference count of the object drops to zero.
  2. The garbage collector tries to deallocate the object and free the memory.
  3. The garbage collector sees that the obj has __del__ finalizer method and runs it.
  4. We define a new global variable called obj. I know I cheated here and I already knew the object’s name was obj. There are ways to find the name but that would have made the code way longer. You can see the full version implemented here.
  5. We set the global obj to be self which is the current object that is being deleted.
  6. The object is resurrected! It is the same object with the same ID as before!
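
The same trick works without knowing the variable’s name if the object parks itself somewhere else. This is a minimal sketch assuming CPython; the Zombie class and the survivors list are made up for this example and are not the full version mentioned above.

```python
survivors = []   # hypothetical parking spot for resurrected objects


class Zombie:
    def __del__(self):
        # self is still usable here, so we can create a new reference to it.
        survivors.append(self)


z = Zombie()
original_id = id(z)

del z                                   # refcount hits zero, __del__ runs
print(len(survivors))                   # 1
print(id(survivors[0]) == original_id)  # True: same object, back from the dead
```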

PEP 442

PEP 442 was introduced in Python 3.4 and made some backward-incompatible changes to how the finalizer is called by the garbage collector.

We were running the above code:

print(obj)
del obj
print(obj)

And in Python 3.4+ what we get is something like:

<obj:4482786192>
You can't delete me!
<obj:4482786192>

But in Python 2 to 3.3 you get:

<obj:4549435760>
You can't delete me!
<obj:4549435760>
You can't delete me!
You can't delete me!

Basically in Python 3.4+ __del__ methods will be executed at most once by the garbage collector and it will no longer matter whether an object with a finalizer is a part of cyclic trash.
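
The “at most once” part can be checked with a short sketch, assuming CPython 3.4 or newer. Phoenix, calls, and shelter are hypothetical names:

```python
calls = []
shelter = []   # hypothetical list that keeps the resurrected object alive


class Phoenix:
    def __del__(self):
        calls.append("__del__")
        shelter.append(self)   # resurrect


p = Phoenix()
del p             # first death: __del__ runs and resurrects the object
shelter.clear()   # second death: the last reference is dropped again

print(calls)      # ['__del__'] on CPython 3.4+, the finalizer is not rerun
```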

More to read 

Now that you have made it this far, there are a couple of articles I would recommend reading to learn more about the garbage collector and even other implementations of it:

Conclusion 

1 min

We explored magic methods’ capabilities by experimenting with:

  • Making a number object so we type less
  • Subclassing filter so that printing it shows the filtered results
  • Echo to file
  • Pipe and grep
  • Undeletable object

As we saw, the magic methods can bring a lot to the table; the limit is only your imagination!

Hope you learnt something from these bad ideas! https://github.com/seperman/bad-ideas

Don’t forget to pip install bad-ideas and play around with the code!
