
10 Interesting Python 3.9 Features



From dictionary update and merge operators to new string methods and the introduction of the zoneinfo library, Python 3.9 adds a number of new features. Furthermore, a new, stable, high-performance parser has been introduced.


The standard library has been updated with numerous new features, including two new modules, zoneinfo and graphlib. A number of modules have been improved too, such as ast, asyncio, concurrent.futures, multiprocessing and xml, among others.

This release has further stabilized the Python standard library.

Let’s explore Python 3.9 features now.


1. Feature: Dictionary Update And Merge Operators


Two new operators, | and |= have been added to the built-in dict class.

The | operator is used to merge dictionaries whereas the |= operator can be used to update the dictionaries.


PEP: 584


Code:


For Merge: |

>>> a = {'farhad': 1, 'blog': 2, 'python': 3}
>>> b = {'farhad': 'malik', 'topic': 'python3.9'}
>>> a | b
{'farhad': 'malik', 'blog': 2, 'python': 3, 'topic': 'python3.9'}
>>> b | a
{'farhad': 1, 'topic': 'python3.9', 'blog': 2, 'python': 3}

For Update: |=

>>> a |= b
>>> a
{'farhad': 'malik', 'blog': 2, 'python': 3, 'topic': 'python3.9'}

The key rule to remember is that if there are any key conflicts then the rightmost-value will be kept. It means that the last seen value always wins. This is the current behavior of other dict operations too.


Detailed Explanation:

As we can see above, the two new operators, | and |=, have been added to the built-in dict class.

The | operator is used to merge dictionaries whereas the |= operator can be used to update the dictionaries.


We can consider | as the + (concatenate) in lists and we can consider |= as the += operator (extend) in lists.


In version 3.8 and earlier, there were already a few ways of merging and updating dictionaries.

For instance, we can do first_dict.update(second_dict). The issue with this approach is that it modifies first_dict in place. One way to solve this issue is to copy first_dict into a temporary variable and then perform the update. However, that adds extra code just to get the merge to work.


We can also use {**first_dict, **second_dict}. The issue with this approach is that it is not easily discoverable and it is harder to understand what the code is intended to do. The other issue is that the mapping types are ignored and the result is always a plain dict. For instance, if first_dict is a defaultdict and second_dict is a dict, the result is a plain dict and the default factory is lost.
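The type difference can be seen in a short sketch. The variable names and values below are illustrative only, and the | operator requires Python 3.9 or later:

```python
from collections import defaultdict

first_dict = defaultdict(int, {"blog": 2})
second_dict = {"topic": "python3.9"}

# Dict unpacking always builds a plain dict, whatever the input types are.
unpacked = {**first_dict, **second_dict}

# The | operator returns a new mapping of the left operand's type.
merged = first_dict | second_dict

print(type(unpacked).__name__)  # dict
print(type(merged).__name__)    # defaultdict
print(merged["missing"])        # 0, the default factory survives the merge
```

Neither expression modifies first_dict, so no temporary copy is needed.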


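Lastly, the collections library contains a ChainMap class. It can wrap two dictionaries, as in ChainMap(second_dict, first_dict), and present them as a single merged mapping, but again, this class is not commonly known. Note that ChainMap resolves duplicate keys in the opposite order to the other approaches: the first mapping that contains the key wins.


Converting the result back to the original type, as in type(first_dict)(ChainMap(second_dict, first_dict)), also fails for subclasses of dict that have an incompatible __init__ method.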
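As a small illustration of how ChainMap treats duplicate keys, here is a sketch with made-up dictionaries (the final comparison uses the | operator, so it assumes Python 3.9 or later):

```python
from collections import ChainMap

first_dict = {"farhad": 1, "blog": 2}
second_dict = {"farhad": "malik", "topic": "python3.9"}

# ChainMap searches its mappings from left to right, so the FIRST one wins.
view = ChainMap(first_dict, second_dict)
print(view["farhad"])  # 1

# To mimic "last seen wins", pass the mappings in reverse order
# and copy the view into a real dict.
merged = dict(ChainMap(second_dict, first_dict))
print(merged == first_dict | second_dict)  # True
```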


2. Feature: New Flexible High Performant PEG-Based Parser


Python 3.9 replaces the previous LL(1)-based Python parser with a new PEG-based parser that is stable and performs comparably.


PEP: 617


Detailed Explanation:

Before 3.9, the CPython parser was LL(1) based. Consequently, the grammar had to be LL(1) as well so that this parser could handle it. An LL(1) parser is a top-down parser that reads its input from left to right. The grammar is context-free, hence the context of the tokens is not taken into account.


Python 3.9 replaces it with a new PEG-based parser, which lifts the LL(1) restrictions on the Python grammar. Additionally, the old parser had been patched with a number of hacks that can now be removed. As a result, the maintenance cost should go down in the long run.


For instance, although LL(1) parsers and grammars are simple to implement, the restrictions do not allow them to express common constructs in a way that is natural to the language designer and the reader. The parser only looks one token ahead to distinguish possibilities.


In a PEG, the choice operator | is ordered. For instance, if the following rule is written:

rule: A|B|C

A context-free grammar parser such as the LL(1) parser will, given an input string, deduce whether it needs to expand A, B or C. The PEG parser is different. It checks whether the first alternative succeeds, and only if it fails does it continue with the second and then the third.


The PEG parser generates exactly one valid tree for a string. Hence a PEG cannot be ambiguous, unlike a general context-free grammar.
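Ordered choice can be sketched with a toy parser combinator. This is only an illustration of the idea and has nothing to do with how CPython's parser is implemented; the helper names literal and choice are made up for this example:

```python
def literal(text):
    """Return a parser that matches `text` at `pos` and yields the new position, or None."""
    def parse(source, pos):
        return pos + len(text) if source.startswith(text, pos) else None
    return parse


def choice(*alternatives):
    """PEG ordered choice: try the alternatives in order and commit to the first success."""
    def parse(source, pos):
        for alternative in alternatives:
            end = alternative(source, pos)
            if end is not None:
                return end
        return None
    return parse


# The same two alternatives, written in a different order.
short_first = choice(literal("a"), literal("ab"))
long_first = choice(literal("ab"), literal("a"))

print(short_first("ab", 0))  # 1, "a" succeeded first so "ab" was never tried
print(long_first("ab", 0))   # 2
```

Because the first successful alternative is always taken, a given input can only ever be parsed one way, which is why the order of alternatives matters when writing a PEG rule.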


The PEG parser also directly generates the AST nodes for a rule via grammar actions. This means it avoids building an intermediate syntax tree.


The key point to take away is that the PEG parser has been extensively tested and validated, and its performance has been fine-tuned. As a result, in most cases it comes within 10% of the previous parser in both speed and memory consumption. This is mainly because the intermediate syntax tree is not constructed.


I am eliminating the mention of the low-level details for the sake of keeping the article simple and readable. The link is provided at the bottom if more information is required to be understood.


3. Feature: New String Functions To Remove Prefix and Suffix

Two new functions have been added to the str object.

  • The first function removes the prefix. It is str.removeprefix(prefix).

  • The second function removes the suffix. It is str.removesuffix(suffix).


PEP: 616


Code:

'farhad_python'.removeprefix('farhad_')
# returns 'python'
'farhad_python'.removesuffix('_python')
# returns 'farhad'


Detailed Explanation:

One of the common tasks in a data science application that involves manipulating text is to remove the prefix/suffix of the strings. Two new functions have been added to the str object. These functions can be used to remove unneeded prefix and suffix from a string.


The first function removes the prefix. It is str.removeprefix(prefix). The second function removes the suffix. It is str.removesuffix(suffix).


Remember string is a collection of characters and each character has an index in a string. We can use the indexes along with the colon : to return a subset of the string. This feature is known as slicing a string.


If we study the functions, internally they check whether the string starts with the prefix (or ends with the suffix). If it does, they return the part of the string after the prefix (or before the suffix) using slicing. Otherwise, they return the original string unchanged.
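The behaviour described above can be approximated in pure Python. The following is a sketch in the spirit of the reference implementation given in PEP 616; the function names remove_prefix and remove_suffix are chosen here for illustration and are not part of the standard library:

```python
def remove_prefix(text: str, prefix: str) -> str:
    # Slice the prefix off only when the string really starts with it.
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


def remove_suffix(text: str, suffix: str) -> str:
    # The `suffix` check matters: with an empty suffix, text[:-0] would be "".
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


print(remove_prefix("farhad_python", "farhad_"))  # python
print(remove_suffix("farhad_python", "_python"))  # farhad
print(remove_suffix("farhad_python", "_blog"))    # farhad_python (unchanged)
```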


With these functions being part of the standard library, we get an API that is consistent, less fragile, fast and more descriptive.


4. Feature: Type Hinting For Built-in Generic Types

Annotating programs has been made simpler in this release by removing the parallel type hierarchy in Python.


The release has enabled support for the generics syntax in all standard collections currently available in the typing module.


We can use the list or dict built-in collection types as generic types instead of using the typing.List or typing.Dict in the signature of our function.


Therefore, the code now looks cleaner and it has made it easier to understand/explain the code.


PEP: 585


Detailed Explanation:

Although Python is a dynamically typed language, annotating the types in a Python program enables introspection of those types. Consequently, the annotations can be used for API generation or for runtime type checking by third-party tools.


This release has enabled support for the generics syntax in all standard collections currently available in the typing module.


A generic type is usually a container e.g. list. It is a type that can be parameterized. Parameterized generic is an instance of a generic with the expected types for container elements e.g. list[str]


We can use the list or dict built-in collection types as generic types instead of using the typing.List or typing.Dict.


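For example, we can annotate a function so that a type checker can verify how it is called:

def print_value(input: str):
    print(input)
# A static type checker such as mypy reports an error if the argument is not a string.
# Python itself does not enforce the annotation at runtime.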

Over the past few releases, a number of static typing features have been built incrementally on top of the existing Python runtime. Some of these features were constrained by existing syntax and runtime behavior. As a consequence, there was a duplicated collection hierarchy in the typing module due to generics.


For instance, we would see typing.List and typing.Dict alongside the built-in list and dict, and so on. With this release, the built-in types can be used directly, which enables us to write code such as:

def read_blog_tags(tags: list[str]) -> None:
    for tag in tags:
        print("Tag Name", tag)
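The same syntax works for other built-in collections and for return types. As a further sketch (the function count_tags is a made-up example, and the syntax requires Python 3.9 or later):

```python
from collections import Counter


def count_tags(tags: list[str]) -> dict[str, int]:
    """Count how often each tag occurs, using built-in generics in the signature."""
    return dict(Counter(tags))


print(count_tags(["python", "blog", "python"]))  # {'python': 2, 'blog': 1}

# A parameterized built-in is an ordinary object that can be inspected at runtime.
print(list[str])  # list[str]
```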


5. Feature: Support For IANA timezone In DateTime

The module zoneinfo has been created to support the IANA time zone database. This support for the IANA time zone database has been added to the standard library.


PEP: 615


IANA time zones are often called tz or zone info. There are a large number of IANA time zones, each identified by a key that is usually of the form Continent/City. For instance, we can pass such a key to ZoneInfo and use the result as the tzinfo of a datetime object.

dt = datetime(2000, 1, 25, 1, tzinfo=ZoneInfo("Europe/London"))

If we pass in an invalid key then zoneinfo.ZoneInfoNotFoundError will be raised.


Detailed Explanation:

We use the datetime library to create a datetime object and specify its time zone by setting the tzinfo property. However, we can end up writing complex time zone rules when subclassing the datetime.tzinfo base class ourselves.


Most of the time, we only need to create the object and set its time zone to either UTC, the system local time zone, or an IANA time zone.


We can create a zoneinfo.ZoneInfo(key) object where the key is a string identifying the zone file in the system time zone database. The zoneinfo.ZoneInfo(key) object can then be set as the tzinfo property of the datetime object.


Code:

from zoneinfo import ZoneInfo
from datetime import datetime

# Integer literals must not have leading zeros in Python 3.
dt = datetime(2000, 1, 25, 1, tzinfo=ZoneInfo("America/Los_Angeles"))
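A slightly fuller sketch shows converting between two zones and handling an unknown key. The helper load_zone is hypothetical, and the example assumes a time zone database is available (from the operating system or the tzdata package); without one, every lookup returns None here:

```python
from datetime import datetime
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def load_zone(key: str) -> Optional[ZoneInfo]:
    """Return the ZoneInfo for `key`, or None if the key cannot be found."""
    try:
        return ZoneInfo(key)
    except ZoneInfoNotFoundError:
        return None


london = load_zone("Europe/London")
los_angeles = load_zone("America/Los_Angeles")
unknown = load_zone("Mars/Olympus_Mons")  # not an IANA key, so this is None

if london is not None and los_angeles is not None:
    dt = datetime(2000, 1, 25, 1, tzinfo=london)
    # Same instant, expressed in Pacific time.
    print(dt.astimezone(los_angeles))  # 2000-01-24 17:00:00-08:00
```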


6. Feature: Ability To Cancel Concurrent Futures

A new parameter, cancel_futures, has been added to concurrent.futures.Executor.shutdown().

This parameter cancels all of the pending futures that have not started. Prior to version 3.9, the process would wait for them to complete before shutting down the executor.


Explanation:

The new parameter cancel_futures has been added to both ThreadPoolExecutor and ProcessPoolExecutor. When its value is True, all pending futures are canceled when the shutdown() function is called.


In a nutshell, when shutdown() is executed, the interpreter checks whether the executor has been garbage collected. If it is still in memory, it takes all of the pending work items and cancels their futures.

Once there are no pending work items, it shuts down the worker.
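A minimal sketch of the parameter in use, assuming a single worker thread so that most of the submitted tasks are still pending when shutdown() is called. The task function and the timings are illustrative:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_task(n):
    """Stand-in for real work: sleep briefly, then return the argument."""
    time.sleep(0.2)
    return n


# With one worker, at most one task runs at a time; the rest wait in the queue.
executor = ThreadPoolExecutor(max_workers=1)
futures = [executor.submit(slow_task, n) for n in range(5)]

# Cancel everything that has not started, then wait for the running task to finish.
executor.shutdown(wait=True, cancel_futures=True)

# Typically only the first task ran; the queued ones report cancelled() == True.
print([future.cancelled() for future in futures])
```

Calling result() on a canceled future raises CancelledError, so callers should check cancelled() first or handle that exception.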


7. Feature: AsyncIO and multiprocessing Improvements

A number of improvements have been made to the asyncio and multiprocessing library in this release.

For instance:

  1. The reuse_address parameter of asyncio.loop.create_datagram_endpoint() is no longer supported due to significant security concerns.

  2. Two new coroutines, loop.shutdown_default_executor() and asyncio.to_thread(), have been added. shutdown_default_executor() schedules a shutdown of the default executor and waits for its ThreadPoolExecutor to finish closing. asyncio.to_thread() is mainly used for running IO-bound functions in a separate thread to avoid blocking the event loop.
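As a rough sketch of asyncio.to_thread(), the following runs two blocking calls in worker threads while the event loop stays free. The function blocking_io is a made-up stand-in for real IO, and the example requires Python 3.9 or later:

```python
import asyncio
import time


def blocking_io(delay):
    """Stand-in for a blocking call such as a file or network read."""
    time.sleep(delay)
    return f"slept {delay}s"


async def main():
    # Each call runs in a separate thread, so the two sleeps overlap.
    return await asyncio.gather(
        asyncio.to_thread(blocking_io, 0.1),
        asyncio.to_thread(blocking_io, 0.1),
    )


results = asyncio.run(main())
print(results)  # ['slept 0.1s', 'slept 0.1s']
```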

With regard to the multiprocessing library improvements, a new method, close(), has been added to the multiprocessing.SimpleQueue class.


This method explicitly closes the queue. This will ensure that the queue is closed and does not stay around for longer than expected. The key to remember is that the methods get(), put(), empty() must not be called once the queue is closed.


8. Feature: Consistent Package Import Errors

The main issue with importing Python libraries prior to the 3.9 release was the inconsistent import behavior in Python when the relative import went past its top-level package.


In that situation, builtins.__import__() raised ValueError while importlib.__import__() raised ImportError.

This has been fixed now. builtins.__import__() now raises ImportError instead of ValueError.
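The behaviour can be observed by asking for a relative import that climbs above a top-level package. In this sketch the package name "pkg" and the module name "something" are placeholders, and neither needs to exist because the error is raised while the name is being resolved:

```python
import importlib

# Pretend the import statement is executed inside the top-level package "pkg".
fake_globals = {"__package__": "pkg"}

raised = []
for importer in (__import__, importlib.__import__):
    try:
        # level=2 corresponds to "from .. import something", one level above "pkg".
        importer("something", fake_globals, None, (), 2)
    except (ImportError, ValueError) as exc:
        raised.append(type(exc).__name__)

print(raised)  # ['ImportError', 'ImportError'] on Python 3.9+
```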


9. Feature: Random Bytes Generation

Another feature that has been added in the 3.9 release is the function random.Random.randbytes(). This function can be used to generate random bytes.


We can generate random numbers, but what if we need to generate random bytes? Prior to version 3.9, developers had to get creative to generate random bytes. We can use os.getrandom(), os.urandom() or secrets.token_bytes(), but those cannot produce reproducible pseudo-random sequences.


For instance, to ensure the random numbers are generated with the expected behaviour and the process is reproducible, we generally seed an instance of the random.Random class.


As a result, the random.Random.randbytes() method has been introduced. It can generate random bytes in a controlled, reproducible manner too.
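A short sketch of the reproducibility this gives (the seed value 42 is arbitrary, and randbytes() requires Python 3.9 or later):

```python
import random

# Two generators created with the same seed produce the same byte sequence.
rng_a = random.Random(42)
rng_b = random.Random(42)

first = rng_a.randbytes(8)
second = rng_b.randbytes(8)

print(len(first))       # 8
print(first == second)  # True
```

As with the rest of the random module, these bytes are not suitable for security purposes; secrets.token_bytes() remains the right tool there.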


10. Feature: String Replace Function Fix

Prior to Python version 3.9, "".replace("", s, n) returned an empty string instead of s for all non-zero n.

This bug confused the users and caused inconsistent behavior in applications.


The 3.9 release has fixed this issue and it is now consistent with "".replace("", s).


The way the replace function works is that it replaces occurrences of a substring with a new substring, up to an optional maximum number of replacements.

str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.

To further explain the issue, prior to version 3.9, the replace function had inconsistent behaviour:

"".replace("", "blog", 1)
Returns ''
One would expect to see blog

"".replace("", "|", 1)
Returns ''
One would expect to see |

"".replace("", "prefix")
However, this returns 'prefix'

Therefore the change now is that if we pass in:

"".replace("", s, n) returns s instead of an empty string for all non-zero n
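A quick check of the fixed behaviour, as a small sketch (the value of s is arbitrary):

```python
s = "blog"

# Without a count, this has always returned s.
print(repr("".replace("", s)))     # 'blog'

# With a non-zero count, this returns s on Python 3.9+ (it was '' before).
print(repr("".replace("", s, 1)))  # 'blog'
```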

A number of redundant features have been removed in Python 3.9 too, such as Py_UNICODE_MATCH.



Source: Medium


The Tech Platform
