In defense of Matlab

https://commons.wikimedia.org/wiki/File:Heart-Drawn_Using_MATLAB.svg

I recently came across this article by Olivia Guest on neuroplausible about how programming is taught in psychology, and specifically how Matlab (to put it politely) is the wrong tool for the purpose. Originally, I was reluctant to comment on the article. The author clearly states that it is intended for people programming in the psychology world, but it makes a number of criticisms of Matlab, some of which I agree with and some of which I don't, that are independent of the application. And since I've worked pretty extensively with Matlab, and it was my first real exposure to programming (although I'd briefly worked with other languages prior), I feel like I'm not a complete outsider in commenting on these points.

As a disclosure though, I've had the privilege of attending Mathworks (the makers of Matlab) events where I've had pretty direct access to many of the developers and even executives. This has generally left me with a more positive view of the toolset overall, but I'm by no means convinced that it's either the appropriate or the best tool for every job.

Having said that, let's look at some of the arguments that I most agree or disagree with:

... I posit that Matlab knowledge can make it harder than absolutely no programming knowledge for us to shift to another language. Matlab has an IDE that provides GUI functionality that allows us to edit variables dynamically like in Excel, which we know causes demonstrable problems. It causes some of our students to think that the Matlab IDE is what programming is, in much the same way some of our students think SPSS is what statistics is. Furthermore, high dependence on manually editing things is extremely bad because our workflow will not be reproducible nor replicable.

I agree: tasks in any programming environment should be automated as much as is reasonable, since the entire point of programming is to automate tasks. I'm of two minds about this point though. On one hand, I can see this from the point of view of the "bad training wheels" idea: the temptation to abuse certain features of the IDE could be avoided by using a language that does not rely on an IDE in the first place. However, I also can't avoid asking why those teaching programming to these students (and this is where I'm out of my element as an engineer, not a psychologist) don't structure the material in such a way as to weed out these habits. Any language is prone to its own abuses by lazy or rushed programmers, but is this a problem of language, or pedagogy?

To put this another way, when one is learning to drive they do not tend to learn to drive using an automatic gearbox. They learn to drive with a manual gearbox and it is tough. Learning the harder of the two types, manual, allows us to then easily transfer to the easier of the two if need be. In the case of USAmericans, they mostly learn to drive an automatic gearbox and almost never learn manual (because their skills do not transfer easily). Although the metaphor is simplistic, it suffices to explain why Matlab is not the best language to learn, it is a car with an automatic gearbox. We cannot easily transfer what we have learned to driving stick and in fact licences for just automatic transmission exist in my home country and the UK: if you learn just automatic you cannot be expected to know stick, whereas if you learn manual transmission you know “everything”.

Another 50/50 for me. I agree that learning manual is the superset of skills required to drive a car. But it doesn't answer the question of whether learning to drive manual, even if it's useful in certain situations, is really necessary. As a USAmerican, I grew up only driving automatic, until a few years ago, when I went out of my way to learn manual since a certain car I wanted only came in a manual version. Do I enjoy driving manual, and do I feel like it gives me more control of the driving experience? Absolutely. Do I think that everyone should know how to drive manual when most of what they want to do is drive to and from their various daily chores? Probably not. And while it's perhaps stretching the analogy too far, I should mention that many automatics now have better 0-60 times (or 0-100 if we're being metric) and better fuel economy than manuals, exactly because the systems have been optimized beyond what plodding humans are capable of in terms of shift times and minding their RPMs. Like an automatic, I think Matlab does hide a lot of the more painful elements of programming, and that can make it awkward to move to another language. But having programmed in Python, R, Julia, and a few others, many with similar IDEs, it's not something I've really struggled with. And also like an automatic, I think Matlab, from what I've seen at Mathworks, is being constantly honed to be faster and more efficient on the backend, without demanding more expertise from the programmer. And I have to say, I don't think that's a bad thing.

Matlab puts a ceiling on what kinds of projects we can do both in size and in scope. Optimising for hardware, needing to lower space and time complexity, wanting something very specific like web-scraping, etc., are all tougher within Matlab. This is because Matlab is more a domain-specific than a domain-general language, it is centrally controlled, and the GUI and IDE cannot cope with large projects easily (although there is a command line mode, which we will be predominantly uncomfortable with given we only know Matlab).

As I have moved into other programming languages, the domain-specific nature (and limitations) of Matlab has become a lot more obvious. And I've run into cases myself where certain projects have been held up by licensing concerns or proprietary toolboxes, which is clearly not a problem in the open source world. And I completely agree that this is frustrating and counterproductive. While Mathworks is trying to get ahead of emerging trends like data science and machine learning, it's also true that many of the most powerful tools out there now are open source. Can they stay relevant in an era of TensorFlow and pandas? Moreover, can they get far enough ahead of those tools to justify the price tag?

Perhaps most flagrantly, arrays in Matlab start at 1. One has no idea how maladaptive this is until they move outside Matlab. Computer science starts from zero for a reason. If we want to learn generalisable skills, learning that indexing starts at 1 will hinder us, perhaps even cause us to introduce very nasty hard-to-find bugs when we move outside the Matlab ecosystem. All these put together cause us to get more confused by new languages as the baggage we carry with us from learning Matlab needs to be actively unlearned and inhibited.

This is a very odd criticism to me. Yes, many programming languages are zero-indexed, but several of the languages listed in the article as alternatives to Matlab, namely R and Julia, are also one-indexed. The split seems to stem from mathematicians wanting one-indexing and computer scientists feeling that zero-indexing is more appropriate. To me this is a minor issue, the same way you have to remember how scoping and argument passing differ between languages.
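To make the size of the adjustment concrete, here is a minimal sketch in Python (which is zero-indexed). The helper `get_one_based` and the list `samples` are names I made up for illustration; they don't come from Dr. Guest's article or from any library.

```python
def get_one_based(seq, i):
    """Return the i-th element of seq, counting from 1 as Matlab, R, and Julia do."""
    # Reject anything outside 1..len(seq), including 0, which is not a valid
    # position under one-based counting.
    if not 1 <= i <= len(seq):
        raise IndexError(f"one-based index {i} out of range 1..{len(seq)}")
    # Shift by one to reach Python's zero-based position.
    return seq[i - 1]


samples = [10, 20, 30]  # hypothetical data

first_zero_based = samples[0]                          # Python's native first element
first_one_based = get_one_based(samples, 1)            # same element, one-based
last_one_based = get_one_based(samples, len(samples))  # last element, one-based
```

The bug Dr. Guest is worried about is real enough: carry the one-based habit over unchanged and `samples[1]` quietly hands you the second element, while `samples[len(samples)]` raises an `IndexError`. But the whole translation is a shift by one, which is why I'd file it alongside the other conventions you relearn with each language.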

Secondly, Matlab is closed source, proprietary, and prohibitively expensive if you have to buy it yourself. They obfuscate their source code in many cases, meaning bugs are much harder to spot and impossible to edit ourselves without risking court action. Moreover, using Matlab for science results in paywalling our code. We are by definition making our computational science closed.

YES, all of this, except maybe the court action. There have been so many times when something that seemed like it should work didn't, and I had no way besides contacting Mathworks to figure out why. And there were also many times when I wanted to know how a part of the code worked to better explain to people where numbers were coming from and was unable to access that part of the toolset. Without quoting several paragraphs that follow this one, Dr. Guest makes the excellent point that being closed source/proprietary poses significant ethical problems for conducting science with Matlab. How can code be shared and independently verified given the high price tag for the tools involved to run it? Does this not impose a paywall on any science conducted with Matlab?

In conclusion, I still disagree with Dr. Guest that Matlab is a scourge on the programming world. I think it has certain niches where it's the best choice and that many (but not all) of the problems with the language might be addressed on the educational side, although this point is really out of my area of expertise. There are also other tools in the Matlab universe, like Simulink, which are beyond the scope of this article, but are nonetheless extremely valuable to engineers. Certainly Simulink is prone to abuse and not everyone is a fan, but I have yet to hear of a tool or programming language that invariably, or even mostly, produces high quality, readable code or diagrams. And as someone who mostly started in Matlab and is now branching out, I don't feel like I've been particularly stunted in my growth, although I will agree that I've found I have a lot to learn.

That's not to say though that I think Mathworks is on the right path. This article is one of a number of times I've heard Matlab's closed source nature pointed out as a reason to avoid it. Can Matlab survive in a world that's increasingly open source? Without the community development that's propelled R and Python (among others), can Matlab keep up? And on an even more basic level, is it ethical to promote Matlab's use in science, knowing that whatever code is generated will live behind a paywall? Personally, I suspect Matlab would solve many problems if at least the basic language/packages were made open source. Large corporations would still need support contracts, and open sourcing the basics would eliminate the constant headache of adjusting licenses based on certain ones running out or being unused. Would it be in Mathworks' best short term interest? I have no idea, probably not. But with new languages like Julia, and ascendant older ones like Python and R, what is their best long term play? It's a question I hope they're asking, and one whose answer I hope ends up benefitting everyone.