
Data Scientist’s Guide to Efficient Coding in Python
Each tip comes with a real-life coding scenario in which we have actually used it!
1. Use tqdm when working with for loops.
Imagine looping over a large iterable (list, dictionary, tuple, set) and not knowing whether the code has finished running! Bummer, right? In such scenarios, wrap the iterable in tqdm to display a progress bar alongside.
For instance, to display the progress as I read through all the files present in 44 different directories (whose paths I have already stored in a list called fpaths):
import os
from tqdm import tqdm

files = list()
fpaths = ["dir1/subdir1", "dir2/subdir3", ......]
# Wrapping the iterable in tqdm prints a progress bar as the loop advances
for fpath in tqdm(fpaths, desc="Looping over fpaths"):
    files.extend(os.listdir(fpath))

Using tqdm with a “for” loop
Note: Use the desc argument to specify a small description for the loop.
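To try the loop above without a real directory tree, here is a minimal, self-contained sketch. It builds a few throwaway directories with tempfile and lists their contents behind a progress bar. The names list_all_files and demo_paths are hypothetical, and the fallback wrapper is only an assumption so the snippet still runs (without a progress bar) when tqdm is not installed.

```python
import os
import tempfile

try:
    from tqdm import tqdm
except ImportError:
    # Assumption: fall back to a no-op wrapper if tqdm is not installed
    def tqdm(iterable, **kwargs):
        return iterable


def list_all_files(fpaths):
    """Collect the names of all entries found in each directory in fpaths."""
    files = []
    # desc sets the short label shown to the left of the progress bar
    for fpath in tqdm(fpaths, desc="Looping over fpaths"):
        files.extend(os.listdir(fpath))
    return files


# Build three throwaway directories with two empty files each
with tempfile.TemporaryDirectory() as root:
    demo_paths = []
    for i in range(3):
        d = os.path.join(root, f"dir{i}")
        os.makedirs(d)
        for j in range(2):
            open(os.path.join(d, f"file{j}.txt"), "w").close()
        demo_paths.append(d)
    found = list_all_files(demo_paths)

print(len(found))  # 6
```

With only three directories the bar fills almost instantly; the benefit shows up when each iteration is slow or the list is long.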
2. Use type hinting when writing functions.
In simple terms, it means explicitly stating the type of each argument, and of the return value, in your Python function definition.
I wish I could point to specific use cases where I rely on type hinting in my work, but the truth is, I use it more often than not.
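As a quick illustration of the syntax, here is a minimal sketch using only the standard library. The function mean_accuracy is a hypothetical example, not something from my projects. Note that Python does not enforce hints at runtime; they serve as documentation for readers and as input for editors and static type checkers.

```python
from typing import List


def mean_accuracy(scores: List[float], ndigits: int = 2) -> float:
    """Return the mean of the given accuracy scores, rounded to ndigits."""
    return round(sum(scores) / len(scores), ndigits)


print(mean_accuracy([0.5, 1.0]))  # 0.75

# The hints are stored on the function object and can be inspected
print(mean_accuracy.__annotations__)
```

Anyone reading the signature can tell at a glance that scores is a list of floats, ndigits is an integer, and the result is a float, without opening the function body.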
Here’s a hypothetical example of a function update_df(). It updates a given data frame by appending a row containing useful information from a simulation run — such as classifier used, accuracy scored, train-test split size, and additional remarks for that particular run.
import pandas as pd
from typing import List

def update_df(df: pd.DataFrame,
              clf: str,
              acc: float,
              remarks: List[str] = [],
              split: float = 0.5) -> pd.DataFrame:
new_row = {'Classifier'<