Python For Scientists and Engineers

If you are looking to expand your skills beyond LabVIEW, Python is a good choice. I recently took Enthought's class. Here are my thoughts.

Last week I took a Python class from Enthought. It was appropriately called "Python For Scientists and Engineers". If you are looking to branch out from LabVIEW at all, I highly recommend this course. I enjoyed it. I learned a lot. The material was great and the teacher, Glen, was really knowledgeable, good at explaining things, patient, and most of all very enthusiastic. It was a five-day online course, and they were full eight-hour days. When I teach half-day Zoom courses for NI, I am exhausted at the end of each half-day. I don't know how Glen did it for five full days. It was hard enough as a student.

Topics

Over the 5 days, we covered a variety of topics. For my taste, the course was a little too focused on plotting and not focused enough on the software engineering side. I can't argue with that emphasis, since the title is "Python For Scientists and Engineers". It doesn't say anything about software engineering, and they do have a separate "Software Engineering in Python" course, which I'll have to take at some point. I assume scientists are very interested in the intricacies of plotting data. For me, I just need to be pointed in the right direction regarding how to plot data and I can figure the rest out. This class went into way too much detail about plotting for me. That didn't ruin the class, because I still learned a lot.

The Basics

The first day or so was aimed at the basics of Python. Most of this was just a review for me, although reviewing list and dictionary comprehensions is always useful. I also learned a few tricks, like building a dictionary from 2 lists with dict(zip(list1, list2)), and realizing that instead of typing out things like ['a','b','c'] you can just do list('abc'). I'm still trying to figure out how to write efficient and concise Python code that is still readable. It is possible to do so much in a single line of Python code, yet that level of conciseness can make things very hard to read. It seems like a real balancing act.
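
Here is a small sketch of those tricks side by side. The variable names and sample values are made up purely for illustration.

```python
# Hypothetical sample data, chosen only for illustration.
channels = list("abc")          # same result as ['a', 'b', 'c']
readings = [1.5, 2.5, 3.5]

# Pair two lists up into a dictionary.
by_channel = dict(zip(channels, readings))

# List comprehension: square every reading.
squared = [r ** 2 for r in readings]

# Dictionary comprehension: keep only the readings above 2.
high = {name: value for name, value in by_channel.items() if value > 2}

print(by_channel)
print(squared)
print(high)
```

Each of these fits on one line and is still readable, which is about where I would draw the line on conciseness.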

IPython

Somehow I had never discovered IPython. You may know about Python's REPL (Read-Evaluate-Print Loop). You can type python at the command prompt and you get a >>> prompt. It lets you enter various commands and see the output. The LabVIEW equivalent is creating a new blank VI and dropping some nodes in there just to see what they do. IPython is that idea on steroids.

IPython has a bunch of magic commands for things like running scripts, exploring the environment, and accessing previous values. It has really good help (just type ? after a command or module to see the docstring, or ?? to see the source). It has really useful tab completion that lets you browse all the available methods on an object. You can also run shell commands in IPython by prefacing them with a !. This is really useful because you can run git commands right from IPython. I also discovered IPython integrates well with PyCharm. If you pip install ipython in your virtual environment, PyCharm's Python Console tab will use it instead of the normal REPL.

JupyterLab

Speaking of IDEs, the class uses JupyterLab, which I had not previously used. I had played around with Jupyter Notebooks a little, but that was it. JupyterLab is a browser-based IDE. It's alright. I still prefer PyCharm. JupyterLab didn't seem to have any built-in refactoring tools, which was a big negative in my mind. It also didn't seem to have a quick and easy way to run tests. I would have to flip from the text editor to the IPython terminal and run !pytest myfile.py every time. It also didn't automatically save edits, which meant that I had to remember to hit Ctrl+S before I switched tabs. PyCharm autosaves, so I just type in my change and hit Shift+F10 and it immediately runs my tests.

Jupyter Notebooks

Jupyter Notebooks themselves are worth playing around with. They let you intersperse your code with chunks of markdown text and graphs and plots. For typical LabVIEW Test and Measurement applications, maybe not as useful, but for laboratory types doing experiments and wanting to record what they've done and explain it to and share it with others, it could be quite useful.

Object Oriented Programming

We spent half a day on Object Oriented Programming (OOP). It's kind of hard to talk about Python without mentioning OOP, since everything is an object. The teacher did a good job. OOP is a complicated subject and you can only cover so much in half a day. It really was just the basics of OOP. As someone who does a lot of OOP in LabVIEW, I found it slightly disappointing, but the teacher did the best he could in a short amount of time. We did cover more OOP topics later when we discussed Traits.

Matplotlib

Next, we talked about Matplotlib. This is the standard way to do graphs in Python. It reminded me of plotting in MATLAB - the syntax is very similar. I hadn't done much of that since I finished college over a decade ago, but it was like riding a bike. This is one area where LabVIEW really shines. The WYSIWYG nature of LabVIEW makes it much easier to adjust plot settings, as opposed to running your Python code, looking at the plot, wanting to change something, having to figure out which variable to adjust, and THEN rerunning your code to check the results. Matplotlib seemed very inefficient in that regard.

The one nice thing I will say about Matplotlib is that having the plot settings spelled out makes them much easier to diff. It's much easier to see what changed. In LabVIEW, the diff just tells you the graph properties changed, as if that information alone is supposed to be useful. I guess you could get something similar in LabVIEW if you used property nodes and dropped constants. Maybe I'll get in the habit of doing that going forward. As the Zen of Python puts it, "Explicit is better than implicit."
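
To make the diffing point concrete, here is a minimal sketch of a plot with its settings spelled out. The data, labels, and styling are invented for illustration, and I am assuming the non-interactive Agg backend so that it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Made-up data for illustration.
time_s = [0, 1, 2, 3, 4]
voltage_v = [0.0, 0.8, 1.1, 0.9, 0.2]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(time_s, voltage_v, color="tab:blue", linewidth=2, marker="o", label="Channel 0")

# Every setting below is a line of text, so a change shows up clearly in a diff.
ax.set_title("Step Response")
ax.set_xlabel("Time (s)")
ax.set_ylabel("Voltage (V)")
ax.set_ylim(0, 1.5)
ax.grid(True)
ax.legend(loc="upper right")
fig.tight_layout()
```

If someone later changes the y-axis limit or the legend location, the diff shows exactly that one line. To look at the result you would add something like fig.savefig("step_response.png") at the end.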

NumPy

Next, we talked about NumPy, which is probably one of the more popular Python packages. I liked it because its typing got me back closer to the LabVIEW world. Native Python lists can hold any mix of data types, which I always found confusing. Each NumPy array has a single specific type. Also, Python ints can grow basically without limit (bounded only by the amount of RAM), whereas NumPy gives you the standard fixed-width types such as int32 and uint32 (I32 and U32 in LabVIEW terms). Another difference is the way memory and copies are handled: NumPy is much more explicit about when you get a view of existing data and when you get a copy. One last difference is that NumPy does element-wise math. As an example, np.array([1,2,3])*3 outputs array([3, 6, 9]), whereas [1,2,3]*3 outputs [1, 2, 3, 1, 2, 3, 1, 2, 3].
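
The differences above can be sketched in a few lines. The variable names are my own, and this only scratches the surface of how NumPy handles types and memory.

```python
import numpy as np

# NumPy arrays carry a single fixed dtype, much like a typed LabVIEW array.
counts = np.array([1, 2, 3], dtype=np.int32)
print(counts.dtype)

# Arithmetic is applied element by element...
scaled = counts * 3
# ...whereas multiplying a plain list repeats it.
repeated = [1, 2, 3] * 3

# Fixed-width integers wrap around on overflow; Python ints just keep growing.
small = np.array([255], dtype=np.uint8)
wrapped = small + np.uint8(1)
big = 2 ** 100

# Slicing returns a view onto the same memory, not a copy.
view = counts[:2]
view[0] = 99                  # also changes counts[0]
independent = counts.copy()   # an explicit copy owns its own memory
independent[1] = -1           # counts is unaffected

print(scaled, repeated, wrapped, counts, independent)
```

The view behavior is the one that surprised me most: writing through a slice changes the original array unless you ask for a copy.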

Pandas

Next, we talked about Pandas. Pandas is a data analysis toolkit that is super popular in the Python world. I kind of think of it as an alternative to SQL. It allows you to put data into tables and then manipulate it to calculate sums and means, etc. I guess one advantage over SQL is that you can run arbitrary Python functions on the data as opposed to just the built-in SQL ones. Also apparently Pandas has the ability to import data in a wide variety of formats and has vast export capabilities as well.
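
As a rough sketch of that idea, here is a made-up table grouped and summarized, first with built-in aggregates and then with an arbitrary Python function. The column names and the peak_to_peak helper are hypothetical.

```python
import pandas as pd

# Made-up measurements for illustration.
df = pd.DataFrame(
    {
        "sensor": ["A", "A", "B", "B", "B"],
        "value": [1.0, 3.0, 2.0, 4.0, 6.0],
    }
)

# The SQL-like part: group rows and compute built-in aggregates.
summary = df.groupby("sensor")["value"].agg(["sum", "mean"])


# The part that is harder in SQL: aggregate with an arbitrary Python function.
def peak_to_peak(values):
    """Return the spread between the largest and smallest value."""
    return values.max() - values.min()


spread = df.groupby("sensor")["value"].agg(peak_to_peak)

print(summary)
print(spread)
```

The first aggregation is roughly what a GROUP BY with SUM and AVG would give you in SQL.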

For my part, I spent a bunch of time learning SQL last year. So my go-to is to import data into SQLite using DB Browser for SQLite (sqlitebrowser). It's a free tool, although you should support them. It allows you to import CSV files, and it has a window to run queries and do some very basic plotting. Also, whereas LabVIEW requires a separate toolkit for SQLite, support is built into Python through the sqlite3 module. The class didn't cover that; I just discovered it on my own. But if you are interested in SQLite in Python, here is all that is required:

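import sqlite3
from contextlib import closing

# closing() makes sure the connection is closed when the block ends
with closing(sqlite3.connect('mydb.db')) as connection:
    results = connection.execute('SELECT * FROM mytable').fetchall()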
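
For something you can run without an existing database file, here is a slightly fuller sketch that uses an in-memory database. The table and column names are made up for illustration.

```python
import sqlite3
from contextlib import closing

# ":memory:" creates a throwaway database, so nothing is written to disk.
with closing(sqlite3.connect(":memory:")) as connection:
    # "with connection:" wraps the statements in a transaction and commits on success.
    with connection:
        connection.execute("CREATE TABLE readings (channel TEXT, value REAL)")
        connection.executemany(
            "INSERT INTO readings (channel, value) VALUES (?, ?)",
            [("ai0", 1.0), ("ai0", 3.0), ("ai1", 10.0)],
        )

    # Aggregate in SQL and fetch the rows as a list of tuples.
    averages = connection.execute(
        "SELECT channel, AVG(value) FROM readings GROUP BY channel ORDER BY channel"
    ).fetchall()

print(averages)
```

Note that using the connection itself in a with statement only manages the transaction. It does not close the connection, which is why closing() is there.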

So for now, I think I'll stick with SQLite, but if I decide I need the power of Pandas, I at least know where to start.

Traits

One of the biggest complaints I hear from LabVIEW Developers using Python is the typing. Chris Stryker has complained to me about this. When you have a Python function, it can be hard to tell what data it returns or even what data it expects. Traits is Enthought's way of addressing that issue. It's a separate framework/toolkit that you can install. It lets you add traits to an object. It's kind of an alternative to the built-in Python properties. Traits provides similar functionality with a lot less coding on the developer's part and it adds some type safety (at run-time).

from traits.api import HasTraits, Float

class Particle(HasTraits):
    x_position = Float(0.0)
    y_position = Float(0.0)

p1 = Particle()
p1.x_position  # returns 0.0
p2 = Particle(x_position=1, y_position=2)
p2.x_position  # returns 1.0 (the int is stored as a float)
p2.x_position = '5'  # raises a TraitError because '5' is not a float
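
For comparison, here is roughly what a single validated attribute takes with Python's built-in properties. This is my own plain-Python sketch, not how Traits works internally. The class name PlainParticle is made up, and it raises a standard TypeError rather than a TraitError.

```python
class PlainParticle:
    """Hypothetical plain-Python counterpart to a class with one Float trait."""

    def __init__(self, x_position=0.0):
        self.x_position = x_position  # goes through the setter below

    @property
    def x_position(self):
        return self._x_position

    @x_position.setter
    def x_position(self, value):
        # Reject anything that is not a real number (bool is excluded on purpose).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"x_position must be a float, got {value!r}")
        self._x_position = float(value)


p = PlainParticle(x_position=1)
print(p.x_position)

try:
    p.x_position = "5"
except TypeError as error:
    print(error)
```

That is a dozen lines for one attribute, and y_position would need the same again. That is the boilerplate a one-line trait declaration saves.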

Traits can also add what it calls properties, which are calculated values. There's a lot there. It is open source, so you can find the repository and take a look at it. In IPython you can just import the traits package and type traits? to explore.

Traits UI

One of the advantages of using Traits is that Enthought has built in a simple GUI editor that lets you manipulate the values of an object. You don't have to use it, but you get it for free. It's not necessarily anything to write home about if you are used to LabVIEW, but it can be nice. You can also build on it with things like Chaco.

Chaco

Chaco is a plotting platform similar to Matplotlib. Enthought built an extension onto Traits UI to embed a plot into an object that HasTraits. Then when you show the Traits UI, it displays a graph. This led to a lot of examples and demos where you ended up with a GUI window that displayed a graph along with some parameters that you could manipulate to update the graph in real time. Again, coming from a LabVIEW background, this is not super impressive, since you get a lot of this functionality built in for free in LabVIEW, but it is still useful. For a text-based language, Chaco and Traits UI are a relatively quick and painless way to get a simple interactive graph up and running.

Software Craftsmanship

This portion of the class starts off with the distinction between Interactive mode and Production mode. Interactive mode is "I'm just screwing around to see what's possible." or "I'm just doing a one-off throwaway code thing where I'm just trying to solve one specific problem." versus "I'm writing code that needs to be maintainable." I think that mindset shift is very important. I often see a lot of LabVIEW developers who don't make that distinction.

In this section, we covered documentation and basic "Clean Code" ideas. We covered basic unit tests. This section also covered debugging and profiling. In my previous Python experience I hadn't gotten to that yet, so that was useful. We also covered logging, which I was aware of but hadn't really used. We covered refactoring, although not in the context of using tests to verify you weren't changing behavior. That's alright though, because it gave me a chance to try out approval testing. It wasn't covered by the class, but it was something I was aware of and wanted to play around with, so I used it in the exercise. It was very straightforward to use and worked very well. I think scientific scripts that take in a data file and transform it into some other format are a perfect use case for approval testing.
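
The idea behind approval testing can be sketched with nothing but the standard library. This is a hand-rolled illustration, not the API of any approval testing package. The reformat and verify functions and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def reformat(raw_text):
    """Hypothetical transform: turn 'name = value' lines into CSV rows."""
    rows = [line.split("=") for line in raw_text.splitlines() if line.strip()]
    return "\n".join(f"{name.strip()},{value.strip()}" for name, value in rows) + "\n"


def verify(received, approved_path):
    """Compare output to the approved file; record it for review if none exists yet."""
    if not approved_path.exists():
        # First run: save the output so a human can inspect and approve it.
        received_name = approved_path.name.replace(".approved.", ".received.")
        approved_path.with_name(received_name).write_text(received)
        return False
    return received == approved_path.read_text()


with tempfile.TemporaryDirectory() as folder:
    approved = Path(folder) / "reformat.approved.txt"
    output = reformat("gain = 2.5\noffset = 0.1\n")

    first_run = verify(output, approved)            # fails: nothing approved yet
    approved.write_text(output)                     # the "approve" step, normally done by hand
    second_run = verify(output, approved)           # passes: matches the approved output
    changed_run = verify(output.upper(), approved)  # fails: the output changed

print(first_run, second_run, changed_run)
```

The appeal for file-transforming scripts is that you never write the expected output by hand. You run the script once, eyeball the result, approve it, and from then on any refactoring that changes the output fails the test.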

The class also covered Flake8, a widely used Python linter that is roughly equivalent to VI Analyzer. I discovered on my own (through the GLA Summit Python Roundtable) an autoformatter called Black, which is very useful. It doesn't quite replace Flake8 though. Flake8 reports problems such as syntax errors and unused imports, whereas Black only rewrites formatting. One nice thing I discovered about Black is that you can connect it to PyCharm and have it automatically reformat your code on save. Very useful.

Extensive Resources

The last thing I want to say about the course is that they provide you with a lot of resources. You get a copy of the slide deck to refer back to. It is huge. There are way more slides than we covered in class, including a whole appendix at the end with additional information. There are also a ton of exercises, many of which we did not have time to get to in class, plus a bunch of demo code. I'm sure I'll be referring to the slide deck and maybe doing some of the exercises we skipped.