5.19. OOP Object Constructor

__new__ will always get called when an object has to be created. There are some situations where __init__ will not get called. One example is when you unpickle objects from a pickle file, they will get allocated (__new__) but not initialised (__init__) 1.

In Object Oriented Programming constructor is:

  1. Special method

  2. Called automatically on object creation

  3. Can set instance attributes with initial values

  4. Works on not fully created object

  5. Method calls are not allowed (as of object is not ready)

  6. Returns None

Python __init__() method:

  1. Yes

  2. Yes

  3. Yes

  4. No

  5. No

  6. Yes

Python __new__() method:

  1. Yes

  2. Yes

  3. Yes (could be)

  4. Yes

  5. Yes (before instantiating) / No (after instantiating)

  6. No

In Python by definition both methods __new__() and __init__() combined and called consecutively are constructors. This is something which is not existing in other programming languages, hence programmers has problem with grasping this idea.

In most cases people will take their "experience" and "habits" from other languages, mixed with vogue knowledge about __new__() and call __init__() a constructor.

5.19.1. Example

>>> class Astronaut:
...     def __new__(cls, *args, **kwargs):
...         print('Before instantiating')
...         result = super().__new__(cls, *args, **kwargs)
...         print('After instantiating')
...         return result
...
...     def __init__(self):
...         print('Initializing')
>>>
>>>
>>> astro = Astronaut()
Before instantiating
After instantiating
Initializing

5.19.2. New Method

  • object constructor

  • solely for creating the object

  • cls as it's first parameter

  • when calling __new__() you actually don't have an instance yet, therefore no self exists at that moment

  • Source: 2

Called to create a new instance of class cls. __new__() is a static method (special-cased so you need not declare it as such) that takes the class of which an instance was requested as its first argument. The remaining arguments are those passed to the object constructor expression (the call to the class). The return value of __new__() should be the new object instance (usually an instance of cls).

Typical implementations create a new instance of the class by invoking the superclass's __new__() method using super().__new__(cls[, ...]) with appropriate arguments and then modifying the newly-created instance as necessary before returning it.

If __new__() is invoked during object construction and it returns an instance of cls, then the new instance's __init__() method will be invoked like __init__(self[, ...]), where self is the new instance and the remaining arguments are the same as were passed to the object constructor.

If __new__() does not return an instance of cls, then the new instance's __init__() method will not be invoked.

__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation. It is also commonly overridden in custom metaclasses in order to customize class creation.

>>> class Astronaut:
...     def __new__(cls):
...         print('Constructing object')
...         return super().__new__(cls)
>>>
>>>
>>> astro = Astronaut()
Constructing object

5.19.3. Init Method

  • object initializer

  • for initializing object with initial values

  • self as it's first parameter

  • __init__() is called after __new__() and the instance is in place, so you can use self with it

  • it's purpose is just to alter the fresh state of the newly created instance

  • 3

Called after the instance has been created (by __new__()), but before it is returned to the caller. The arguments are those passed to the class constructor expression. If a base class has an __init__() method, the derived class's __init__() method, if any, must explicitly call it to ensure proper initialization of the base class part of the instance; for example: super().__init__([args...]).

Because __new__() and __init__() work together in constructing objects (__new__() to create it, and __init__() to customize it), no non-None value may be returned by __init__(); doing so will cause a TypeError to be raised at runtime.

>>> class Astronaut:
...     def __init__(self):
...         print('Initializing object')
>>>
>>>
>>> astro = Astronaut()
Initializing object

5.19.4. Return

>>> class Astronaut:
...     def __new__(cls):
...         print('Constructing object')
...         return super().__new__(cls)
...
...     def __init__(self):
...         print('Initializing object')
>>>
>>>
>>> astro = Astronaut()
Constructing object
Initializing object

Missing return from constructor. The instantiation is evaluated to None since we don't return anything from the constructor:

>>> class Astronaut:
...     def __new__(cls):
...         print('Constructing object')
...         super().__new__(cls)
...
...     def __init__(self):
...         print('Initializing object')  # -> is actually never called
>>>
>>>
>>> astro = Astronaut()
Constructing object
>>>
>>> type(astro)
<class 'NoneType'>

Return invalid from constructor:

>>> class Astronaut:
...     def __new__(cls):
...         return 'Mark Watney'
>>>
>>> astro = Astronaut()
>>>
>>> type(astro)
<class 'str'>
>>> astro
'Mark Watney'

Return invalid from initializer:

>>> class Astronaut:
...     def __init__(self):
...         return 'Mark Watney'
>>>
>>> astro = Astronaut()
Traceback (most recent call last):
TypeError: __init__() should return None, not 'str'

5.19.5. Do not trigger methods for user

  • It is better when user can choose a moment when call .connect() method

Let user to call method:

>>> class Server:
...     def __init__(self, host, username, password=None):
...         self.host = host
...         self.username = username
...         self.password = password
...         self.connect()    # Better ask user to ``connect()`` explicitly
...
...     def connect(self):
...         print(f'Logging to {self.host} using: {self.username}:{self.password}')
>>>
>>>
>>> connection = Server(
...     host='example.com',
...     username='admin',
...     password='myVoiceIsMyPassword')
Logging to example.com using: admin:myVoiceIsMyPassword

Let user to call method:

>>> class Server:
...     def __init__(self, host, username, password=None):
...         self.host = host
...         self.username = username
...         self.password = password
...
...     def connect(self):
...         print(f'Logging to {self.host} using: {self.username}:{self.password}')
>>>
>>>
>>> connection = Server(
...     host='example.com',
...     username='admin',
...     password='myVoiceIsMyPassword')
>>>
>>> connection.connect()
Logging to example.com using: admin:myVoiceIsMyPassword

However it is better to use self.set_position(position_x, position_y) than to set those values one by one and duplicate code. Imagine if there will be a condition boundary checking (for example for negative values):

>>> class PositionBad:
...     def __init__(self, position_x=0, position_y=0):
...         self.position_x = position_x
...         self.position_y = position_y
...
...     def set_position(self, x, y):
...         self.position_x = x
...         self.position_y = y
>>>
>>>
>>> class PositionGood:
...     def __init__(self, position_x=0, position_y=0):
...         self.set_position(position_x, position_y)
...
...     def set_position(self, x, y):
...         self.position_x = x
...         self.position_y = y
>>> class PositionBad:
...     def __init__(self, position_x=0, position_y=0):
...         self.position_x = min(1024, max(0, position_x))
...         self.position_y = min(1024, max(0, position_y))
...
...     def set_position(self, x, y):
...         self.position_x = min(1024, max(0, x))
...         self.position_y = min(1024, max(0, y))
>>>
>>>
>>> class PositionGood:
...     def __init__(self, position_x=0, position_y=0):
...         self.set_position(position_x, position_y)
...
...     def set_position(self, x, y):
...         self.position_x = min(1024, max(0, x))
...         self.position_y = min(1024, max(0, y))

5.19.6. Use Case - 0x01

  • Iris Factory

>>> from dataclasses import dataclass, field
>>>
>>>
>>> DATA = [
...     (5.8, 2.7, 5.1, 1.9, 'virginica'),
...     (5.1, 3.5, 1.4, 0.2, 'setosa'),
...     (5.7, 2.8, 4.1, 1.3, 'versicolor'),
...     (6.3, 2.9, 5.6, 1.8, 'virginica'),
...     (6.4, 3.2, 4.5, 1.5, 'versicolor'),
...     (4.7, 3.2, 1.3, 0.2, 'setosa')]
>>>
>>>
>>> @dataclass
... class Iris:
...     sepal_length: float
...     sepal_width: float
...     petal_length: float
...     petal_width: float
...     species: str = field(repr=False)
...
...     def __new__(cls, *args, **kwargs):
...         *measurements, species = args
...         clsname = species.capitalize()
...         cls = globals()[clsname]
...         return super().__new__(cls)
>>>
>>>
>>> class Setosa(Iris):
...     pass
>>>
>>> class Virginica(Iris):
...     pass
>>>
>>> class Versicolor(Iris):
...     pass
>>>
>>>
>>> result = [Iris(*row) for row in DATA]
>>> result  
[Virginica(sepal_length=5.8, sepal_width=2.7, petal_length=5.1, petal_width=1.9),
 Setosa(sepal_length=5.1, sepal_width=3.5, petal_length=1.4, petal_width=0.2),
 Versicolor(sepal_length=5.7, sepal_width=2.8, petal_length=4.1, petal_width=1.3),
 Virginica(sepal_length=6.3, sepal_width=2.9, petal_length=5.6, petal_width=1.8),
 Versicolor(sepal_length=6.4, sepal_width=3.2, petal_length=4.5, petal_width=1.5),
 Setosa(sepal_length=4.7, sepal_width=3.2, petal_length=1.3, petal_width=0.2)]

5.19.7. Use Case - 0x02

  • Path

Note, that this unfortunately does not work this way. Path() always returns PosixPath:

>>> from pathlib import Path
>>>
>>>
>>> Path('/etc/passwd')
PosixPath('/etc/passwd')
>>>
>>> Path('c:\\Users\\Admin\\myfile.txt')  
WindowsPath('c:\\Users\\Admin\\myfile.txt')
>>>
>>> Path(r'C:\Users\Admin\myfile.txt')  
WindowsPath('C:\\Users\\Admin\\myfile.txt')
>>>
>>> Path(r'C:/Users/Admin/myfile.txt')  
WindowsPath('C:/Users/Admin/myfile.txt')

5.19.8. Use Case - 0x03

  • Document Factory

  • Factory method

  • Could be used to implement Singleton

>>> class PDF:
...     pass
>>>
>>> class Docx:
...     pass
>>>
>>> class Document:
...     def __new__(cls, *args, **kwargs):
...         filename, extension = args[0].split('.')
...         if extension == 'pdf':
...             return PDF()
...         elif extension == 'docx':
...             return Docx()
>>>
>>>
>>> file1 = Document('myfile.pdf')
>>> file2 = Document('myfile.docx')
>>>
>>> print(file1)  
<__main__.PDF object at 0x...>
>>>
>>> print(file2)  
<__main__.Docx object at 0x...>

5.19.9. Use Case - 0x04

  • Document Factory

>>> class Docx:
...     pass
>>>
>>> class PDF:
...     pass
>>>
>>> class Document:
...     def __new__(cls, filename):
...         basename, extension = filename.split('.')
...         match extension:
...             case 'pdf':             return PDF()
...             case 'doc' | 'docx':    return Docx()
>>>
>>>
>>> file1 = Document('myfile.pdf')
>>> file2 = Document('myfile.docx')
>>> file3 = Document('myfile.doc')
>>>
>>> print(file1)  
<__main__.PDF object at 0x...>
>>>
>>> print(file2)  
<__main__.Docx object at 0x...>
>>>
>>> print(file3)  
<__main__.Docx object at 0x...>

5.19.10. Use Case - 0x05

  • Document Factory

>>> from abc import ABC, abstractmethod, abstractproperty
>>>
>>>
>>> class Document(ABC):
...     @abstractproperty
...     def EXTENSIONS(self) -> list[str]:
...         ...
...
...     @abstractmethod
...     def display(self):
...         ...
...
...     def __init__(self, filename):
...         self.filename = filename
...
...     def __str__(self):
...         return self.filename
...
...     def __new__(cls, filename):
...         extension = filename.split('.')[-1]
...         plugins = cls.__subclasses__()
...         for plugin in plugins:
...             if extension in plugin.EXTENSIONS:
...                 obj = object.__new__(plugin)
...                 obj.__init__(filename)
...                 return obj
...         else:
...             raise NotImplementedError('No plugin for this filetype')
>>>
>>>
>>> class PDF(Document):
...     EXTENSIONS = ['pdf']
...
...     def display(self):
...         print(f'Displaying pdf file {self.filename}')
>>>
>>>
>>> class Word(Document):
...     EXTENSIONS = ['docx', 'doc']
...
...     def display(self):
...         print(f'Displaying word file {self.filename}')
>>>
>>>
>>> class Txt(Document):
...     EXTENSIONS = ['txt']
...
...     def display(self):
...         print(f'Displaying txt file {self.filename}')
>>>
>>>
>>>
>>> file = Document('myfile.pdf')
>>> file.display()
Displaying pdf file myfile.pdf
>>>
>>>
>>> file = Document('myfile.doc')
>>> file.display()
Displaying word file myfile.doc
>>>
>>> file = Document('myfile.docx')
>>> file.display()
Displaying word file myfile.docx
>>>
>>> file = Document('myfile.txt')
>>> file.display()
Displaying txt file myfile.txt
>>>
>>> file = Document('myfile.odt')
Traceback (most recent call last):
NotImplementedError: No plugin for this filetype

5.19.11. Use Case - 0x06

>>> from datetime import datetime
>>> import logging
>>> from uuid import uuid4
>>> from abc import ABC, abstractmethod
>>>
>>>
>>> class BaseClass(ABC):
...     def __new__(cls, *args, **kwargs):
...         obj = object.__new__(cls)
...         obj._since = datetime.now()
...         obj._uuid = str(uuid4())
...         obj._logger = logging.getLogger(cls.__name__)
...         return obj
...
...     def _log(self, level: int, id: int, msg: str):
...         self._logger.log(level, f'[{level}:{id}] {msg}')
...
...     def _debug(self, id:int, msg:str):    self._log(logging.DEBUG, id, msg)
...     def _info(self, id:int, msg:str):     self._log(logging.INFO, id, msg)
...     def _warning(self, id:int, msg:str):  self._log(logging.WARNING, id, msg)
...     def _error(self, id:int, msg:str):    self._log(logging.ERROR, id, msg)
...     def _critical(self, id:int, msg:str): self._log(logging.CRITICAL, id, msg)
...
...     @abstractmethod
...     def __init__(self):
...         pass
>>>
>>>
>>> class Astronaut(BaseClass):
...     def __init__(self, *args, **kwargs):
...         ...
>>>
>>>
>>> astro = Astronaut()
>>>
>>> vars(astro)  
{'_since': datetime.datetime(1969, 7, 21, 2, 56, 15),
 '_uuid': '83cefe23-3491-4661-b1f4-3ca570feab0a',
 '_log': <Logger Astronaut (WARNING)>}
>>>
>>> astro._error(1234, 'An error occurred')  

5.19.12. References

1

Noufal Ibrahim. Python (and Python C API): __new__ versus __init__. Year: 2011. Retrieved: 2022-03-09. URL: https://stackoverflow.com/a/5143108

2

Basic customization. object.__new__(cls[, ...]). Year: 2022. Retrieved: 2022-04-01. URL: https://docs.python.org/3/reference/datamodel.html#object.__new__

3

object.__init__(self[, ...]). Year: 2022. Retrieved: 2022-04-01. URL: https://docs.python.org/3/reference/datamodel.html#object.__init__

5.19.13. Assignments

Code 5.45. Solution
"""
* Assignment: OOP ObjectConstructor Syntax
* Complexity: easy
* Lines of code: 6 lines
* Time: 5 min

English:
    1. Define class `Point` with methods:
        a. `__new__()` returning new `Point` class instances
        b. `__init__()` taking `x` and `y` and stores them as attributes
    2. Run doctests - all must succeed

Polish:
    1. Zdefiniuj klasę `Point` z metodami:
        a. `__new__()` zwraca nową instancję klasy `Point`
        b. `__init__()` przyjmuje `x` i `y` i zapisuje je jako atrybuty
    2. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from inspect import isclass

    >>> assert isclass(Point)
    >>> assert hasattr(Point, '__new__')
    >>> assert hasattr(Point, '__init__')
    >>> pt = Point.__new__(Point)
    >>> assert type(pt) is Point
    >>> pt.__init__(1, 2)
    >>> assert pt.x == 1
    >>> assert pt.y == 2
"""

Code 5.46. Solution
"""
* Assignment: OOP ObjectConstructor Passwd
* Complexity: easy
* Lines of code: 21 lines
* Time: 13 min

English:
    TODO: English translation
    X. Run doctests - all must succeed

Polish:
    1. Iteruj po liniach w `DATA` i podziel linię po dwukropku
    2. Stwórz klasę `Account`, która zwraca instancje klas
       `UserAccount` lub `SystemAccount` w zależności od wartości pola UID
    3. User ID (UID) to trzecie pole, np.
       `root:x:0:0:root:/root:/bin/bash` to UID jest równy `0`
    4. Jeżeli UID jest:
       a. poniżej 1000, to konto jest systemowe (`SystemAccount`)
       b. 1000 lub więcej, to konto użytkownika (`UserAccount`)
    5. Stwórz metodę ``__repr__`` wyświetlającą informacje o klasie w formacie:
       `SystemAccount(username='root', uid=0)`
    6. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `str.splitlines()`
    * `str.split()`
    * `str.strip()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> list(result)  # doctest: +NORMALIZE_WHITESPACE
    [SystemAccount(username='root', uid=0),
     SystemAccount(username='bin', uid=1),
     SystemAccount(username='daemon', uid=2),
     SystemAccount(username='adm', uid=3),
     SystemAccount(username='shutdown', uid=6),
     SystemAccount(username='halt', uid=7),
     SystemAccount(username='nobody', uid=99),
     SystemAccount(username='sshd', uid=74),
     UserAccount(username='watney', uid=1000),
     UserAccount(username='lewis', uid=1001),
     UserAccount(username='martinez', uid=1002)]
"""

DATA = """root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
nobody:x:99:99:Nobody:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
watney:x:1000:1000:Mark Watney:/home/watney:/bin/bash
lewis:x:1001:1001:Melissa Lewis:/home/lewis:/bin/bash
martinez:x:1002:1002:Rick Martinez:/home/martinez:/bin/bash"""

# Parse DATA and convert to UserAccount or SystemAccount
# type: list[Account]
result = ...