Snakes in the Grass – Learn From My Python Fails

codeboom | November 29, 2012 | January 6, 2013 | Computing, Teaching | array, case statements, do-while, equivalent code, fail, list, named tuple, pitfalls, procedural programming languages, python, record, repeat-until, select-case, snake, software, technology, version

As the January exams approach I have found myself becoming increasingly reflective about my teaching using Python. I have taught the OCR AS Level Computing course for two years now and when we first began the course Python was my immediate language of choice for its simplicity and ease of use. I did not have any experience with Python before I taught it, although obviously I had used many other procedural programming languages during my degree and I was easily able to pick it up. I enjoy using Python and I am glad I chose it, but this post intends to help teachers who are just starting out to avoid some of the pitfalls I have experienced.

Dude, where’s my construct?

The OCR AS syllabus requires students to be able to use select-case statements and do-while / repeat-until loops. Because Python is so compact and simple, it does not actually have these constructs! It is definitely possible to write the equivalent code in Python, but the key words are just not there, so you end up having to teach these concepts as pseudo code or in another language. This inevitably means that the students are less familiar with these constructs when they encounter them in the exam.

For info, here is how to do a select-case statement in Python:

# Select case pseudo code
number = input("Enter a number: ")
select( number ){
    case 0:
        print "You entered a zero"
        break
    case 1:
        print "You entered a one"
        break
    default:
        print "You did not enter a zero or a one"
        break
}

# The same effect in Python 
number = input("Enter a number: ")
if number == 0:
   print "You entered a zero"
elif number == 1:
   print "You entered a one"
else:
   print "You did not enter a zero or a one"
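
If you are working in Python 3, the same if/elif/else chain might look something like the sketch below. The function name describe_number is made up for this illustration so the logic can be tried without typing anything in, and the commented-out lines assume you convert the text from input() with int().

```python
# A Python 3 sketch of the select-case example above.
# describe_number is a hypothetical helper name, not part of the original code.
def describe_number(number):
    if number == 0:        # plays the role of "case 0"
        return "You entered a zero"
    elif number == 1:      # plays the role of "case 1"
        return "You entered a one"
    else:                  # plays the role of "default"
        return "You did not enter a zero or a one"


print(describe_number(0))
print(describe_number(7))

# With keyboard input, Python 3's input() returns text, so convert it first:
# number = int(input("Enter a number: "))
# print(describe_number(number))
```

No break is needed in the Python version because only one branch of an if/elif/else chain ever runs.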

And here is how to achieve the same effect as a repeat-until loop:

# Pseudo code for a repeat until loop
counter = 0
repeat {
     print "Hello world"
     counter++
} until counter == 5;

# The same effect in Python
counter = 0
while( True ):
     print "Hello world"
     counter += 1
     if counter == 5:
          break

As you can see from the repeat-until example, Python does not have the popular increment operator (++) either, so if you wish to increment or decrement a variable you need to use += 1 or -= 1 as the shorthand way of doing it.
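
The thing that makes repeat-until different from an ordinary while loop is that the body always runs at least once. The Python 3 sketch below tries to show that side by side. The function names and the idea of counting how many times the body runs are my own choices for illustration, and I have used >= rather than == for the exit test so the loop still stops if the counter starts past the limit.

```python
# Both function names are hypothetical, chosen only for this comparison.
def repeat_until_demo(counter, limit):
    """Emulated repeat-until: run the body first, test the condition last."""
    times_run = 0
    while True:
        times_run += 1          # the loop body
        counter += 1
        if counter >= limit:    # the "until" test comes after the body
            break
    return times_run


def while_demo(counter, limit):
    """Ordinary while loop: the condition is tested before the body."""
    times_run = 0
    while counter < limit:
        times_run += 1
        counter += 1
    return times_run


print(repeat_until_demo(0, 5))  # 5
print(while_demo(0, 5))         # 5
print(repeat_until_demo(5, 5))  # 1 (the body still ran once)
print(while_demo(5, 5))         # 0 (the body never ran)
```

The two loops only disagree when the condition is already met at the start, which is exactly the case students tend to get asked about.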

My advice: There is no “solution” to this problem – you either accept that you’ll have to teach these constructs as pseudo code, or you use another language to teach in.

Array issues

This one was a very silly decision indeed. I discovered that Python has a fantastic data structure called a list. You can do all sorts of useful things with a list – add items of data, iterate through the list, remove items of data, insert items of data, access them via their list index. Best of all, to create one the only thing you need is an identifier and some brackets – heaven! What idiot would want to go through all of the hassle of using the non-native proper Array construct where you have to faff with importing the library and putting in the type code (see this handy site) just to get one set up?

Well, as it turns out, a very sensible idiot.

You see, the key learning points about arrays in the exam are as follows:

To set one up you need to know its identifier, how many data items are inside it, and the data type of those items
Arrays can only hold data of one type (e.g. int, float etc.)
Arrays are a static (fixed size) data structure

However, lists are somewhat different:

To set up a list you need to know its identifier
Lists can hold data of multiple types at once – ["a", 1, 6.887, True] is a perfectly decent list
Lists are a dynamic data structure and can be altered in size on the fly at any time

My advice: Whilst you can do absolutely everything you’d need to do with an array with a list as well, they have very different properties. In the exam, students will encounter questions based on the array learning points listed above which are more true for arrays in, say, Java. You try explaining to a class of students who already know how to do a list inside out that what they’ve learnt isn’t an array after all, and an array is slightly but very subtly different. Beware.
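
To make the difference concrete, here is a small Python 3 sketch comparing a list with the standard library array module. The variable names and values are invented for the example, and "i" is the type code for a signed integer.

```python
from array import array

# A list needs no size or type up front and happily mixes types.
mixed = ["a", 1, 6.887, True]
mixed.append("another item")

# An array needs a type code when it is created: "i" means signed integer.
scores = array("i", [10, 20, 30])

try:
    scores.append(6.887)   # a float is rejected by an integer array
    accepted = True
except TypeError:
    accepted = False

print(len(mixed))   # 5
print(accepted)     # False
```

Be aware that an array from this module can still grow with append, so it demonstrates the "one data type" learning point but not the "fixed size" one.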

Y u so difficult?

The syllabus requires that you teach ‘record’ data structures, another thing which is native in languages such as Pascal but not in Python. (I keep mentioning native – I just mean that it’s part of the language without you having to do anything special to be able to use it.) It’s entirely possible to teach the same concept in Python using a structure called a “named tuple”, but it’s such a faff to set up and it doesn’t really feel as intuitive as the equivalent native type in Pascal (for instance). Add to this that I had no idea what one was until someone told me about it at the CAS working group meeting, so I’d imagine that many teachers new to programming will not quickly encounter this construct within their own learning either.
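
For reference, a named tuple can be set up roughly as in the Python 3 sketch below. The record name Student, its two fields and the sample values are illustrative choices. The sketch also shows one catch: once created, the fields cannot be reassigned.

```python
from collections import namedtuple

# Student and its field names are illustrative choices for this sketch.
Student = namedtuple("Student", ["first_name", "last_name"])

pupil = Student(first_name="John", last_name="Bain")
print(pupil.first_name)   # John

try:
    pupil.last_name = "Smith"   # named tuples are immutable
    changed = True
except AttributeError:
    changed = False

print(changed)            # False

# _replace builds a new record with one field altered; pupil is untouched.
updated = pupil._replace(last_name="Smith")
print(updated.last_name)  # Smith
```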

Edit: Adam McNicol (in the comments) says named tuples are no good for this. He recommends the following:

AQA use a class to simulate a record:

class student:
    def __init__(self):
        self.first_name = ""
        self.last_name = ""

You could then use it:

new_student = student()
new_student.first_name = "John"
new_student.last_name = "Bain"

My advice: Sitting on the fence here. Currently I’m teaching this as a concept using Pascal, but if I can get a simple explanation of named tuples going I will try to use that for more practical experience opportunities.

Version aversion

When I first started out, Python 2.7 was still rather widely used and so I created all of my resources, including my full F452 student workbook, using the syntax from 2.7. More recently, Python 3 has started to take over and this brings with it a number of headaches which are beautifully summed up in an appendix of the book “Invent Your Own Computer Games with Python”. The biggest headache is that the print statement changes into a print function, i.e.

print "Hello world"

becomes…

print("Hello world")

… which sounds OK except that print still adds a newline character at the end by default, and whilst you used to be able to just put a comma afterwards to prevent that, you now have to provide an annoying extra keyword argument to the function – end="". Also, the raw_input() function from Python 2.7 is now called input() in Python 3. Except that in Python 2.7 there was also a function called input(), but that did something different (it evaluated whatever was typed as Python code). CONFUSED DOT COM! There are other syntax changes but these are the most annoying.
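
Here is a short Python 3 sketch of both changes. The value "42" is just a stand-in for whatever a student might type, so the example runs without waiting at a prompt.

```python
# Python 3: end="" stops print from adding a newline.
print("Hello", end="")
print(" world")          # appears on the same line as "Hello"

# Python 3's input() always returns text, like raw_input() did in 2.7,
# so numbers have to be converted explicitly:
# typed = input("Enter a number: ")
typed = "42"             # stand-in for what a user might type
number = int(typed)
print(number + 1)        # 43
```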

My advice: If you are just starting out, choose Python 3 and look for resources made in Python 3, it will save you work in the long run. I am going to be updating my booklet to Python 3 very soon.

Some are more equal than others

This is an extra bonus that I just thought of – the assignment and comparison operators. In Python you use = to assign a value to a variable and == to compare two values. However, in Pascal you use := to assign a value to a variable and = to compare two values. In the exam, pseudo code is used which could go EITHER WAY, because both Python and Pascal are accepted by OCR as teaching languages.

My advice: Continue to teach = and == but warn your students to look at the context clues in the question. They should be able to tell fairly easily from the code whether the = sign is being used as an assignment operator or a comparison operator. You may also need to warn them that && can denote “and” and || can denote “or” as well.
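
A tiny Python 3 sketch of the Python side of this, with the Pascal spellings noted in the comments. The variable names and values are arbitrary.

```python
a = 10
b = 20

a = b                # assignment (Pascal: a := b); a becomes 20, b is unchanged
same = (a == b)      # comparison (Pascal: a = b); True

# Python spells the logical operators as words rather than && and ||.
both = (a == 20 and b == 20)     # True
either = (a == 15 or b == 15)    # False

print(a, b, same, both, either)  # 20 20 True True False
```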

10 thoughts on “Snakes in the Grass – Learn From My Python Fails”

Andrew Hayward (@arhayward) says:

November 29, 2012 at 4:39 pm

What’s the point in that `break` statement in your “Hello world” loop? Would `while (counter < 5)` not suffice?

codeboom says:

November 29, 2012 at 4:42 pm

I’m emulating a repeat-until loop in Python. Obviously it’s silly – you’d just use a while loop. The point is that if you really wanted the same effect as do-while (i.e. check the condition after running the statements once) then that’s how you achieve it.

Richard says:

November 30, 2012 at 9:48 am

I don’t think I’ve ever used a switch statement in real life. Does that make me a bad person?

Pingback: Teaching Programming: Modern or Educational? | Academic Computing
John Stout (@stirlingstout) says:

November 30, 2012 at 10:37 pm

Weird, isn’t it? I always find myself on the defensive in VB when explaining that a = b is assignment if it’s just on its own but is a test for equality if it’s after anything else, so a = b = 0 doesn’t set both a and b to 0 but sets a to True or False depending on whether b is 0 or non-zero (and doesn’t change b at all).

I prefer the Pascal :=/= to either the VB =/= or Python’s =/==, since we’ve taught the students mathematics for longer than computing, and mathematics has always used = for ‘equals’. One year I gave my second year A-level students the a = 10: b = 20: a = b test and was astounded to get some a = 15 and b = 15 answers, which I’m sure had to do with the confusion.

Adam McNicol says:

December 1, 2012 at 4:15 pm

Nice post covering some of the issues with Python. With regard to records, named tuples are not a solution, as tuples are immutable.

AQA use a class to simulate a record:

class student:
    def __init__(self):
        self.first_name = ""
        self.last_name = ""

You could then use it:

new_student = student()
new_student.first_name = "John"
new_student.last_name = "Bain"

It isn’t the nicest syntax but it actually is quite a nice way to introduce classes before OOP.

Adam.

1. codeboom says:
  
  December 1, 2012 at 4:18 pm
  
  Aha! Thanks for the suggestion, I’ll edit it in when I have time (on a mobile now). I haven’t actually used named tuples, it’s just that someone told me they could be used, as it says in the post. The new info is much appreciated!
  
JimD says:

December 5, 2012 at 11:02 pm

Ophidia en herba.

Each of the items you describe here reveals a bias between some traditional coding techniques and the underlying computing concepts. For example the distinctions between an array and a list are largely artifacts of the implementation … a case where the details leak through the abstractions.

There is an array module in the Python standard library which imposes the single-type constraint you describe (though not the fixed size) and helps clarify these distinctions. I would teach the built-in lists first, using the single-dimensional “array” subset of a list’s features: indexing, slicing, iteration and aggregation (length, sum, etc.). Then expand that to emulating a matrix (two-dimensional array) as a list of lists, and use that as a point to introduce the array standard library module and the NumPy array. Finish with the features of lists and list-like objects (appending, pushing/popping, heterogeneous contents) which are distinct from classical array implementations, and those features of the “array” class and NumPy arrays which are distinct from lists and from one another.
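
As a rough sketch of the first two steps in that sequence, the Python 3 below uses only the array-like subset of a list and then a list of lists as a matrix. The names and numbers are invented for the example.

```python
# The "array" subset of a list: indexing, slicing, iteration, aggregation.
marks = [12, 7, 15, 9]

first = marks[0]        # indexing: 12
middle = marks[1:3]     # slicing: [7, 15]
total = sum(marks)      # aggregation: 43
count = len(marks)      # length: 4

for mark in marks:      # iteration
    print(mark)

# A two-dimensional array emulated as a list of lists (2 rows, 3 columns).
grid = [
    [1, 2, 3],
    [4, 5, 6],
]
value = grid[1][2]                       # row 1, column 2: 6
row_totals = [sum(row) for row in grid]  # [6, 15]
print(value, row_totals)
```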

Another example is with the “case” or selection statements. Conditional statements can be viewed as a degenerate form of case (selection on truth or falsity) and could be implemented as a simple dispatch table with only two values. (Some Python code golfers use this sort of trick with lambdas, relying on the coercion of False/True into 0 or 1 … as an index, for example).

We, as human beings, however, think of conditional statements and selection suites as entirely different. I think it’s reasonable to teach them as such and then present case/switch statements in other programming languages as a sort of workaround that these languages implement when functions can’t be passed around as first-class objects (necessary for dispatch tables) and before they supported classes with flexible calling semantics (which allow objects to handle the data on which we’d perform our switching).

This is a rather challenging topic to convey to new students. When we think of a cascade of enumerated conditions on some switch value, it can be initially rather difficult to consider how that can be reformulated as a lookup in a dictionary or as an index into a list such that the object returned, in either case, can be called on to execute our intended functionality. (That’s a dispatch table, of course, and imposes a constraint that all of our cases must be represented in terms of functions which could be called with matching argument signatures).
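
A dispatch table along those lines might be sketched in Python 3 as follows. The handler names, the DISPATCH dictionary and the describe function are all hypothetical, chosen to mirror the zero/one example from the post.

```python
# Each case becomes a function with the same (empty) argument signature.
def on_zero():
    return "You entered a zero"


def on_one():
    return "You entered a one"


def on_other():
    return "You did not enter a zero or a one"


# The dispatch table: switch values mapped to the functions that handle them.
DISPATCH = {0: on_zero, 1: on_one}


def describe(number):
    # Look the handler up, falling back to on_other as the "default" case.
    handler = DISPATCH.get(number, on_other)
    return handler()


print(describe(0))
print(describe(1))
print(describe(42))
```

Note that the functions are stored in the dictionary without brackets, and only the one that is looked up gets called.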

Taking that even further, to encapsulate the behavior into the object being used (through inheritance or composition) is a point I’d only barely mention and gloss over at first. Because it should be covered much later after classes, objects, polymorphism and object composition have been introduced.

For structs I would use simple objects (all attributes with no methods) to emulate them. Then named tuples can be explained as an optimization (much as arrays are an optimization over lists for cases where your intentions fit within their constraints). Named tuples are basically an example of the “flyweight” (GoF) pattern: lighter and less generalized than objects, but with more convenient syntactic sugar than traditional tuples. I would also introduce the “struct” module, but only as a way of clarifying the naming collision and as a foreshadowing of discussions of object serialization/de-serialization and importing or exporting data to or from FFIs (foreign function interfaces).

Almost every programming language I’ve used makes the distinction between assignment (technically “binding” in a late binding, dynamic language such as Python) and comparison for equality. It’s also important to make the distinction between equality and identity comparisons (== vs. the ‘is’ operator) and useful (though potentially confusing) to mention a way in which the underlying implementation details can leak through the abstractions (for example when ‘is’ seems to succeed in the same way as == in cases where the underlying objects have been merged through “intern”-ing of small integers and short strings).
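
The equality versus identity distinction can be sketched in a few lines of Python 3. Lists are used here deliberately, because the interning of small integers and short strings is an implementation detail and would make the output less predictable. The variable names are arbitrary.

```python
first = [1, 2, 3]
second = [1, 2, 3]   # a separate list with the same contents
alias = first        # a second name bound to the same list object

print(first == second)   # True: the contents are equal
print(first is second)   # False: they are two different objects
print(first is alias)    # True: both names refer to one object

alias.append(4)          # changing the object through one name...
print(first)             # [1, 2, 3, 4] ...is visible through the other
```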

The Python distinctions between “and” and & (and “or” vs. | and “not” vs. ~) are relatively straightforward and easy to teach: the keywords work on truth values and the symbols work bitwise. They are far uglier in Perl and Ruby, where “and” and && differ in precedence rather than in the fundamental types of their operands.

Overall I consider all of these to be opportunities to teach the important broader concept. Computer languages, at any point above raw, hand translated, binary machine code, are abstractions for us to use in expressing our intentions to one another (and to our future selves when we maintain our own code) and also to the machines which compile or interpret our code. Fundamentally to be experts in our field we must learn more than just the abstraction, we must also learn about ways in which varying underlying implementation details and hardware, OS, and other constraints “leak” through our layers of abstraction and affect our coding decisions.

One you didn’t cover, and which I certainly would, is the concept of tail recursion and the way in which straightforward Python doesn’t support transparent tail recursion elimination. This is, admittedly, an advanced topic, and recursion is notoriously confusing to first-term CS students. However, it’s a concept that has been so pervasive for so long in academia and in professional discourse that a student who doesn’t “get it” by the second term or so is at a huge practical disadvantage.
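
A minimal Python 3 sketch of the point, assuming the standard CPython interpreter with its usual recursion limit. The function names are invented. The recursive version is written in tail-call style, but Python still uses a new stack frame for every call, so a deep enough call fails where the loop does not.

```python
import sys


def sum_to(n, total=0):
    """Add up 1..n with a tail call: the recursive call is the last action."""
    if n == 0:
        return total
    return sum_to(n - 1, total + n)


def sum_to_loop(n):
    """The same calculation rewritten by hand as a loop."""
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total


print(sum_to(100))   # 5050

# Go deeper than the interpreter's recursion limit allows.
too_deep = sys.getrecursionlimit() + 100
try:
    sum_to(too_deep)
    overflowed = False
except RecursionError:
    overflowed = True

print(overflowed)                          # True
print(sum_to_loop(too_deep) > 0)           # True: the loop copes with the same n
```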

1. codeboom says:
  
  January 6, 2013 at 11:02 am
  
  Sorry – I meant to reply to this comment a long time ago and then forgot all about it! Thank you for writing such a long and considered response.
  
  I’m teaching at high school level which means we have limited amounts of time and a wide variety in student ability. I don’t think there would be time to cover some of the things you mention above (for instance we don’t do lambdas) and they would definitely confuse a lot of the students. Some students have difficulties with understanding of very basic concepts such as the evaluation of conditions to a single value of either true or false, whilst some students understand this immediately. We also face the problem that the exam is written, so whilst students might be perfectly comfortable with implementing certain things, they are unable to express in words how the concept works and thus miss out on marks in the exam – very frustrating.
  
  I agree with most of your points but I wonder whether you are talking about teaching at undergraduate level rather than at high school?
  
David Whale says:

May 16, 2013 at 8:30 am

I’ve come to the party a little late on this particular blog entry, only having at last made the time to read through your older posts, but I think I’ve still got a thing or two to add.

My preference is always to teach programming using a “problem/solution driven” approach – i.e. if you introduce some language construct such as a list or an array, then you should be teaching it as a solution to some problem. I believe that this approach works for all language constructs and programming problems. e.g. introduce loops as a solution to the problem of having to type in lots of repeated code (if you want 36 identical lines of output on the screen, don’t use 36 print statements, introduce a loop as a time-saving solution).

Thus, you could argue that at GCSE level, it doesn’t make any sense to introduce some constructs or features of a language, because they are not solving a problem that your student has at that point in time in their learning – why would you introduce the complexity of classes to a YR8 student who is only writing 1 page long programs with single instance concepts? The power provided by classes doesn’t solve any of their present problems, so they are unlikely to “get it”. Also there are lower level concepts such as functions that need to be taught before classes to gently lead into the concepts of abstraction.

Parallel to this though, there are other needs to meet that might introduce different reasons why a teacher might be forced to address some of these issues earlier than you might hope – the burning desire of children to write games, for example. A really good modern way to do this is to use pygame in python, or mcpi (minecraft pi) with its python programming interface. Now you are sucked quite heavily into classes, and forced to make the choice between explaining it all, or trying to brush over it. Sometimes brushing over it and providing a cookbook of common recipes for them to use just about works.

So, on the argument of missing language constructs – there are two (possibly opposing) driving forces behind your need to teach case statements and arrays. On the one hand, you want your children to learn programming and to solve problems by appropriate use of the correct language techniques. On the other hand, you have to get them through the exams, and to do this they need to “talk the talk” of the examinations board.

Probably the question to ask yourself at this point is, are you teaching them programming, or are you teaching them computer science?

Teaching programming gives short term gains, because they can see the fruits of their labour in the working programs that they create, and gives them a skill they could instantly apply to other areas. But you have to teach a specific language (e.g. python or pascal) to do this.

Teaching computer science gives longer term gains, it teaches them how to approach problem solving and then to apply that to any language. Computer science is enduring and will out-live any programming language. But the gains are longer term, and the children might not see the benefits in the short term. To understand the wider issues of computer science, you need to teach some practicalities to make it more concrete and bring the concepts to life.

The examinations bodies are modernising, but from what I see from their recent syllabus, some of the practicalities of their assessment are not scaling too well to modern day advancements in the practical side of programming – they accept python as a language, but have they altered their concepts to match the modern day language? Possibly not. In python you can call a pre-written function to do a sort for you. In an older language, you had to write the bubble sort from first principles. Which is the right way?
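
To put the two approaches next to each other, here is a Python 3 sketch with the built-in sorted() function alongside a hand-written bubble sort. The bubble_sort function and the sample data are my own illustration rather than anything taken from a syllabus.

```python
def bubble_sort(items):
    """Return a sorted copy of items using bubble sort from first principles."""
    result = list(items)                 # work on a copy
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                # swap neighbours that are in the wrong order
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:                  # no swaps means already sorted
            break
    return result


data = [5, 2, 9, 1, 5]
print(sorted(data))        # [1, 2, 5, 5, 9] using the pre-written function
print(bubble_sort(data))   # [1, 2, 5, 5, 9] from first principles
```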

I personally think that both computer science and programming need to be taught side by side, a little of the practicalities, and a little of the theory to explain those. The practicalities provide a way to test the theories, and the theories give you the tool to reason about why things do or don’t work.

As for your missing constructs in python? The existing assessment system was set up based on older languages such as Pascal (1970) and BASIC (1964), whereas more modern languages such as Python (1991) use more modern and advanced techniques. Perhaps it is time for the assessment system to catch up a bit with modern day languages?

The challenge for you as a teacher of course, should you choose to accept it, is to find ways to teach the children useful and relevant skills in a way that they can pass their exams with good grades. I personally think that a balance between computer science, modern day best-practice, and practical programming is the way to go. And if you have any spare energy, work on the assessment organisations to make sure what they are testing is what is relevant in today’s modern society.

But, don’t take that journey alone. There are many practitioners out there very conscious of the need to protect the future of the industry, most of them are more than willing to help.

oh, and P.S. Don’t teach lambdas to your kids. Lambda calculus itself was an advanced topic in the final year of a compiler writing module on my degree course. It’s not something to teach to kids who are still working through the basics of program construction 😉
