
ETL - Read .py from different sources

1 Introduction

Looking back, we have already covered an incredible number of data science topics.

We have dealt with a wide range of topics in the field of machine learning and with how these algorithms can be applied in practice. A rule of thumb says that a data scientist spends 80% of their time on data preparation, and a corresponding share of the code is written at this stage. If you are working on a customer project where only the results matter, the notebook you are working in quickly becomes overcrowded with code that only prepares the data. In such a case it is a good idea to write an ETL script. ETL stands for extract, transform and load. In Python you can create a Python file (.py) in the editor of your choice (I prefer Microsoft Visual Studio). In this post I want to show how to import such Python files from different locations and call their functions. In the following posts I will present different ETL variations.
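To make the idea of an ETL script a little more concrete, here is a minimal sketch of the three steps using only the standard library. The sample data and the function names extract, transform and load are assumptions chosen for illustration; a real project would read from and write to actual files or databases.

```python
import csv
import io

# Made-up sample data; the second row has a missing amount.
RAW_CSV = "name,amount\nalice,10\nbob,\ncarol,25\n"


def extract(text):
    """Read raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows without an amount and convert the rest to integers."""
    return [
        {"name": row["name"].title(), "amount": int(row["amount"])}
        for row in rows
        if row["amount"]
    ]


def load(rows):
    """Write the cleaned rows back out as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "amount"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


cleaned = transform(extract(RAW_CSV))
print(load(cleaned))
```

Keeping functions like these in a .py file means the notebook only needs a single import and a function call.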

2 The Setup

My actual setup looks a little different, but we will come back to that later. Here I created a project folder in which I put one Python script in folder1, another one in folder2 and a third one under notebooks.

  • Step 1: Navigate to the notebooks folder
  • Step 2: Start Jupyter Notebook from there, so that the relative paths used below resolve correctly

From here I can call the python scripts as follows:

import sys

# Add the folder that contains the first .py file to the module search path.
sys.path.insert(1, '../folder1')
import py_script_source_1 as source1

# Add the folder that contains the second .py file to the module search path.
sys.path.insert(1, '../folder2')
import py_script_source_2 as source2
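Relative entries such as '../folder1' are resolved against the current working directory, which is why the notebook has to be started from the notebooks folder. The following is a minimal sketch of a slightly more defensive variant that turns the relative path into an absolute one and avoids adding it twice. It builds a throwaway project layout in a temporary directory so that it runs anywhere; the names demo_source_1 and hello are hypothetical stand-ins, not the scripts from this post.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway project layout (hypothetical names) so the sketch is self-contained.
project_root = Path(tempfile.mkdtemp())
folder = project_root / "folder1"
folder.mkdir()
(folder / "demo_source_1.py").write_text(
    "def hello():\n    return 'hello from folder1'\n"
)
notebooks = project_root / "notebooks"
notebooks.mkdir()

# Seen from the notebooks folder, '../folder1' becomes an absolute path.
module_dir = (notebooks / ".." / "folder1").resolve()

# Add the directory only once, so re-running the cell does not grow sys.path.
if str(module_dir) not in sys.path:
    sys.path.insert(1, str(module_dir))

# Make sure the import system notices the freshly created file.
importlib.invalidate_caches()
import demo_source_1 as source1

print(source1.hello())
```

With an absolute path, the import no longer depends on where the notebook server was started.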

3 Run the python scripts

Run script 1:

# Run function from py_script_source_1 file
source1.happyBirthdayDaniel()

Run script 2:

# Run function from py_script_source_2 file
source2.greetingsDaniel()
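One practical point when working this way: Python caches imported modules, so if you edit a .py file while the notebook is still running, a repeated import statement will not pick up the change. A minimal sketch of how importlib.reload from the standard library can help is shown here. The module demo_reload_source and its function greet are hypothetical and are created in a temporary directory purely for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a small, hypothetical module in a temporary directory.
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "demo_reload_source.py"
module_file.write_text("def greet():\n    return 'first version'\n")

sys.path.insert(1, str(module_dir))
import demo_reload_source as source

before = source.greet()

# Simulate editing the script while the notebook is still running.
module_file.write_text("def greet():\n    return 'second version, edited'\n")

# A plain 'import' would return the cached module; reload re-executes the file.
source = importlib.reload(source)
after = source.greet()

print(before, "->", after)
```

Restarting the kernel has the same effect, but reloading is usually quicker while a script is still being developed.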

‘Normal’ libraries can still be imported as usual.

import pandas 
import numpy

Even .py files that are located in the same folder as the .ipynb notebook can be imported without specifying an additional path.

import py_script_source_3 as source3

Let’s try this script, too.

# Run function from py_script_source_3 file
source3.thanks_for_reading()
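Importing from the notebook's own folder works because the working directory is typically part of the module search path. If an import ever picks up an unexpected file, the following small sketch shows how to inspect where Python is looking and where a module was actually loaded from. The standard library module json is used here only as a stand-in for py_script_source_3.

```python
import json  # stand-in for py_script_source_3 in this sketch
import os
import sys

# The directory the notebook (or script) is currently running in.
print("Working directory:", os.getcwd())

# The first few locations Python searches when resolving an import.
print("First search paths:", sys.path[:3])

# The file a module was actually loaded from.
print("Loaded from:", json.__file__)
```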

4 Content of the python scripts

As we could see, the functions we used were not breathtaking. They only serve illustrative purposes here.

But here is an overview of their exact contents:

Script 1:

def happyBirthdayDaniel():  # the file only defines this function; nothing is printed until it is called
    print("Happy Birthday to you!")
    print("Happy Birthday to you!")
    print("Happy Birthday, dear Daniel.")
    print("Happy Birthday to you!")

Script 2:

def greetingsDaniel():  # the file only defines this function; nothing is printed until it is called
    print("Thanks for your visit.")
    print("Thanks for being there.")
    print("Nice to have seen you again.")
    print("See you soon!")

Script 3:

def thanks_for_reading():  # the file only defines this function; nothing is printed until it is called
    print("Thank you for reading this article!")

5 Conclusion

In this article I have shown by example how to import Python scripts from different locations and call their functions. As already announced, I will talk about different variations of ETL scripts in the following posts. Keep reading.